Rithika Florian Johnson

MSCS Graduate at University of Massachusetts Amherst


About Me

I am passionate about the dynamic intersection of data science, natural language processing (NLP), and machine learning. With a keen interest in exploring diverse methodologies to tackle complex problems, I thrive on experimenting with data and leveraging innovative approaches to drive insightful solutions.

My journey in data analysis and NLP has been enriched by hands-on experience in diverse projects, including the semantic profiling of mass shooter narratives and enhancing content categorization systems. These projects not only sharpened my technical skills in Python, PyTorch, and advanced statistical techniques but also deepened my appreciation for the nuanced art of data-driven decision-making.

In my professional experience at S&P Global, I honed my abilities in data analytics and process automation, significantly reducing production timelines and enhancing data accessibility. This further solidified my passion for developing robust systems, particularly in database management and migration, using tools like Snowflake and SSMS.

I am also actively engaged in academic research, with publications exploring topics such as generating abstract art from hand-drawn sketches using GAN models and predicting COVID-19 cases using survival analysis and LSTM models. These experiences have not only contributed to my academic growth but also fueled my curiosity to explore the frontiers of machine learning applications. Looking ahead, I am eager to continue pushing boundaries in data science, integrating new technologies, and contributing to innovative solutions that make a tangible impact.

Work Experience

Data Analyst, S&P Global India
Aug 2022 – Jul 2023

  • Analyzed data and resolved issues in crop science and internal projects, implementing automated solutions that reduced data production time by 4 months.
  • Leveraged Alteryx Designer, Tableau, and Python for daily data analytics tasks, ensuring prompt insights to clients.
  • Developed Python code for forecasting values in animal produce datasets, previously a manual process, accelerating results by 1 month.

R&A, Data and Technical Design Intern, S&P Global India
Mar 2022 – Jul 2022

  • Facilitated the smooth migration of data from SSMS to Snowflake by creating views and stored procedures, enhancing data accessibility and performance.
  • Enhanced team efficiency by at least 40% through the creation of Python tools and process automation with VisualCron, streamlining workflows.

Projects

Interactive Music Discovery and Artist Promotion System

Location: Amherst, USA

Date: November 2023

Built a scalable distributed system with MongoDB Atlas, integrating Spotify and YouTube APIs for artist recommendations and interactive music games, impacting over 500,000 users.

Applied audio analysis (pydub, librosa) and a custom popularity score to enhance emerging artist exposure by 80%, using a balanced metric of likes, views, shares, and follower counts.
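
As a rough illustration of what a balanced popularity metric could look like, the sketch below combines likes, views, shares, and follower counts into a single score. The weights and the log scaling are assumptions made for this example; the project description does not state the actual formula. Log scaling is used here so that very large counts from established artists do not completely drown out emerging ones.

```python
import math

# Hypothetical weights: the real project's weighting is not documented here.
WEIGHTS = {"likes": 0.3, "views": 0.3, "shares": 0.2, "followers": 0.2}


def popularity_score(stats, weights=WEIGHTS):
    """Return a weighted sum of log-scaled engagement counts.

    `stats` maps a metric name to a raw non-negative count.
    Missing metrics are treated as zero.
    """
    return sum(
        weight * math.log1p(stats.get(name, 0))
        for name, weight in weights.items()
    )


# Made-up example inputs, for illustration only.
emerging = {"likes": 900, "views": 12_000, "shares": 150, "followers": 400}
established = {
    "likes": 90_000,
    "views": 4_000_000,
    "shares": 8_000,
    "followers": 1_200_000,
}

if __name__ == "__main__":
    print(f"emerging:    {popularity_score(emerging):.2f}")
    print(f"established: {popularity_score(established):.2f}")
```

With this kind of scaling, a hundredfold gap in raw counts becomes a much smaller gap in score, which is one way a ranking could leave room for newer artists.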

Developed secure REST APIs in Python with rate limiting and response caching on a microservices architecture, maintaining over 90% uptime.

Semantic Profiling of Mass Shooter Narratives using NLP

Our research develops a fine-tuning strategy for large language models (LLMs) to detect violent tendencies in social media comments, crucial for identifying and mitigating potential mass shooting threats. We utilized the Mistral-7B model from Unsloth and BERT, exploring fine-tuning methodologies and an ensemble model approach to enhance performance in recognizing traits like terrorism, supremacism, and suicidal thoughts.

Using cross-entropy as our loss function and role prompting with Mistral-7B, we improved trait identification. By fine-tuning Mistral-7B on violent tendencies, we created a specialized model capable of detecting subtle cues. Applying zero-shot chain-of-thought prompting on mass shooter manifestos further enhanced the model's ability to draw accurate conclusions from complex information.

Conducting a comparative study using PyTorch, we achieved a 20% enhancement in overall model performance and a 10% improvement in accuracy by implementing an ensemble model, surpassing individual multistage fine-tuned models.
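
The sketch below shows one common way to ensemble classifiers, soft voting, where the class probabilities from each model are averaged before picking a label. It is an illustration only: the text above does not say how the ensemble was combined, and the label list and logit values are hypothetical. A small cross-entropy helper is included to show how the loss mentioned above is computed for a single example.

```python
import math

# Trait names taken from the description above; the order is arbitrary.
LABELS = ["terrorism", "supremacism", "suicidal thoughts"]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def ensemble_predict(logit_sets, labels=LABELS):
    """Soft voting: average per-model probabilities, then take the argmax."""
    prob_sets = [softmax(logits) for logits in logit_sets]
    averaged = [sum(column) / len(prob_sets) for column in zip(*prob_sets)]
    best = max(range(len(averaged)), key=averaged.__getitem__)
    return labels[best], averaged


def cross_entropy(probs, target_index):
    """Cross-entropy loss for one example with a single correct class."""
    return -math.log(probs[target_index])


if __name__ == "__main__":
    # Hypothetical logits from two models for the same comment.
    model_a = [2.0, 0.5, 0.1]
    model_b = [1.0, 1.5, 0.2]
    label, probs = ensemble_predict([model_a, model_b])
    print(label, [round(p, 3) for p in probs])
```

Averaging probabilities rather than hard votes lets a confident model outweigh an uncertain one, which is often why an ensemble can beat each of its members.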

Enhancing Content Categorization Systems

Location: Amherst, USA

Date: November 2023

Engineered a personalized content recommendation system by integrating LaMP architecture with a fine-tuned Flan-T5-base language model.

Curated a diverse dataset by combining LaMP News Categorization, AG News, and Book Depository datasets, optimizing for both diversity and computational efficiency.

Achieved an 8% increase in model accuracy and a 0.1 increase in F1 score by integrating a diverse Book Depository dataset, showcasing the model’s adaptability and effectiveness in categorizing content across varied domains.

Generating Abstract Art from Hand-Drawn Sketches using GANs

Location: Bangalore, India

Date: June 2022

Implemented CGAN, Cycle GAN, and Pix2Pix GAN models for transforming hand-drawn sketches into captivating abstract art.

Evaluated various image sharpening techniques, including Highboost Filter and Laplacian filter, to enhance model performance.
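
For readers unfamiliar with the two filters named above, here is a minimal pure-Python sketch of each on a grayscale image stored as a list of rows. The 3x3 kernels, the boost factor `k`, the replicated-edge handling, and the clipping range are standard textbook choices assumed for this example, not details taken from the project.

```python
BOX_BLUR = [[1 / 9] * 3 for _ in range(3)]
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]


def filter3x3(image, kernel):
    """Apply a 3x3 kernel, replicating edge pixels at the border.

    Both kernels above are symmetric, so correlation equals convolution.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += image[yy][xx] * kernel[dy + 1][dx + 1]
            out[y][x] = acc
    return out


def clip(value, low=0.0, high=255.0):
    """Keep a pixel value inside the displayable range."""
    return max(low, min(high, value))


def laplacian_sharpen(image):
    """Subtract the Laplacian (negative-centre kernel) to emphasise edges."""
    lap = filter3x3(image, LAPLACIAN)
    return [
        [clip(pixel - edge) for pixel, edge in zip(row, lap_row)]
        for row, lap_row in zip(image, lap)
    ]


def highboost(image, k=1.5):
    """Add k times the detail mask (image minus blur); k > 1 is high-boost."""
    blurred = filter3x3(image, BOX_BLUR)
    return [
        [clip(pixel + k * (pixel - soft)) for pixel, soft in zip(row, blur_row)]
        for row, blur_row in zip(image, blurred)
    ]


if __name__ == "__main__":
    # A vertical step edge: dark on the left, brighter on the right.
    step = [[0.0, 0.0, 100.0, 100.0] for _ in range(4)]
    print(highboost(step)[1])
    print(laplacian_sharpen(step)[1])
```

Both filters leave flat regions essentially unchanged and exaggerate the contrast on either side of an edge, which is the effect sharpening is meant to produce.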

Developed a filter that could be applied on platforms like Instagram, providing a unique avenue for artistic expression.

Publications

Generating Abstract Art from Hand-Drawn Sketches using GANs

Publication: Springer

Date: June 2023

Authors: Chakrabarty, S., Johnson, R.F., Rashmi, M., Raha, R.

This paper explores the use of Generative Adversarial Networks (GANs) to create abstract art from hand-drawn sketches. The research was part of the Proceedings of International Joint Conference on Advances in Computational Intelligence (IJCACI 2022) and provides insights into the intersection of art and artificial intelligence.

Predicting the Number of New Cases of COVID-19 in India

Publication: IEEE Xplore

Date: December 2021

Authors: A. S., Johnson, R. F., R. k. N., M. T R., and V. V.

This study utilizes Survival Analysis and Long Short-Term Memory (LSTM) models to predict the number of new COVID-19 cases in India. Presented at the 2021 Fifth International Conference on I-SMAC (IoT in Social, Mobile, Analytics, and Cloud), it introduces a new metric that significantly enhances the model's accuracy, offering a comprehensive analysis of pandemic trends using advanced computational methods.
