Some of my selected projects

A collection of my work

PharMe: A Pharmaceutical Informed LLM

This study investigated the feasibility and effectiveness of creating a state-of-the-art, condition-aware large language model (LLM), nicknamed PharMe.

Built on Medical LLaMA-3-8B, pre-tuned on the MIMIC-III dataset and further fine-tuned with data from the FDA's Drugs@FDA database and Purple Book, PharMe gives healthcare providers actionable insights, for any given diagnosis, into traditional, cost-effective treatment options, including biosimilars and generics, that they might otherwise have overlooked or been unaware of.

PharMe demonstrated the potential of large language models in healthcare by addressing gaps in pharmaceutical knowledge dissemination. By combining some of the latest AI models with healthcare domain-specific data, PharMe supports healthcare providers, rather than replacing them, in delivering informed, cost-effective, and high-quality care.

PharMe. Doctors Empowered

Keywords : Apache Hadoop, Apache Spark, TensorFlow, NumPy, Pandas, Google Cloud Platform, HDFS, Spark Streaming, Kafka, Transformers

Efficient Transformations in Deep Learning Convolutional Neural Networks

This study investigates the integration of signal processing transformations, namely the Fast Fourier Transform (FFT), Walsh-Hadamard Transform (WHT), and Discrete Cosine Transform (DCT), within the ResNet50 convolutional neural network (CNN) model for image classification.

The primary objective is to assess the trade-offs between computational efficiency, energy consumption, and classification accuracy during training and inference. Using the CIFAR-100 dataset (100 classes, 60,000 images), experiments demonstrated that incorporating WHT significantly reduced energy consumption while improving accuracy.

Specifically, a baseline ResNet50 model achieved a testing accuracy of 66%, consuming an average of 25,606 kJ per model. In contrast, a modified ResNet50 incorporating WHT in the early convolutional layers achieved 74% accuracy, and an enhanced version with WHT applied to both early and late layers achieved 79% accuracy, with an average energy consumption of only 39 kJ per model. These results demonstrate the potential of WHT as a highly efficient and effective approach for energy-constrained CNN applications.
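
For readers unfamiliar with the WHT, the following is a minimal sketch of the transform itself, written in plain Python. It does not show how the transform was wired into ResNet50, which this summary does not describe; the function names `fwht` and `inverse_fwht` are illustrative. It is meant only to show why the transform is cheap: each butterfly step uses additions and subtractions, with no multiplications.

```python
def fwht(values):
    """Return the unnormalized Walsh-Hadamard transform (Hadamard order).

    The input length must be a power of two. The loop structure gives
    O(n log n) work using only additions and subtractions.
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        # Butterfly step: combine neighbouring blocks of width h.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


def inverse_fwht(values):
    """Invert fwht: the transform is its own inverse up to a factor of n."""
    n = len(values)
    return [v / n for v in fwht(values)]


signal = [1, 0, 1, 0, 0, 1, 1, 0]
spectrum = fwht(signal)
restored = inverse_fwht(spectrum)
```

Applying `inverse_fwht` to the output of `fwht` recovers the original signal, which is the property a network layer would rely on when moving between the spatial and transform domains.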

The project is not for public release

Keywords : Deep Learning, Convolutional Neural Networks (CNNs), ResNet50, Signal Processing Transformations, Walsh-Hadamard Transform (WHT), Fast Fourier Transform (FFT), Discrete Cosine Transform (DCT), CIFAR-100 Dataset, Energy Consumption, Frequency-Domain Analysis, Power Consumption, TensorFlow and Keras, GPU Utilization

Hierarchical Voting-Based Feature Selection and Ensemble Learning Model Scheme for Glioma Grading with Clinical and Molecular Characteristics with SMOTE and Decision Threshold Adjustment

Accurate grading of gliomas is essential for effective treatment planning and improved patient outcomes. This study presents a novel framework that combines hierarchical voting-based feature selection with ensemble learning to achieve robust classification of gliomas using molecular and clinical data.

The method leverages diverse feature selection techniques alongside soft-voting ensemble models to identify the most impactful predictors while reducing redundancy and computational overhead. Evaluations using the TCGA and CGGA datasets highlight superior accuracy and cost-effectiveness compared to standard approaches like LASSO.
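
The two voting ideas named here can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's pipeline: the hierarchy of the voting scheme is not described in this summary, and the feature names, selector outputs, probabilities, and threshold value below are all hypothetical placeholders. Hard voting keeps a feature when enough selectors agree on it; soft voting averages the classifiers' predicted probabilities and then applies an adjustable decision threshold.

```python
from collections import Counter


def hard_vote_features(selections, min_votes):
    """Keep the features chosen by at least `min_votes` selectors."""
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, count in votes.items() if count >= min_votes)


def soft_vote(probabilities, threshold=0.5):
    """Average positive-class probabilities, then apply a decision threshold."""
    mean_prob = sum(probabilities) / len(probabilities)
    return mean_prob, int(mean_prob >= threshold)


# Hypothetical outputs of three feature selectors (placeholder names).
selections = [
    ["age", "gene_a", "gene_b"],
    ["gene_a", "gene_b", "gene_c"],
    ["age", "gene_a", "gene_d"],
]
kept = hard_vote_features(selections, min_votes=2)

# Hypothetical positive-class probabilities from three classifiers.
# Lowering the threshold from 0.5 to 0.4 flips this sample to the positive class.
mean_prob, label = soft_vote([0.35, 0.55, 0.45], threshold=0.4)
```

Features supported by only one selector are dropped, which is one simple way to reduce redundancy before the ensemble is trained.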

By integrating advanced methodologies, this approach not only enhances classification reliability but also offers scalability for broader applications in biomedical data analysis. These findings underscore the potential for this framework to drive improved decision-making in clinical oncology and beyond.

The project is not for public release

Keywords : Glioma Grading, Brain Tumor Classification, Hierarchical Voting, Ensemble Learning, Feature Selection, Clinical-Molecular Predictors, Borderline SMOTE, Decision Threshold Adjustment, TCGA Dataset, CGGA Dataset, Soft Voting Scheme, Hard Voting Feature Selection, Random Forest, Recursive Feature Elimination (RFE), LASSO Regression, Principal Component Analysis (PCA), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), AdaBoost Classifier, Cross-Validation

Predicting Corporate Creditworthiness with Machine Learning

This project uses machine learning techniques to build a model that classifies companies as investment grade (IG) or non-investment grade (non-IG) based on financial metrics.

Different types of investors have their own investment strategies, often centered on the credit quality of the firms in their portfolios. For instance, insurance companies and pension funds allocate most of their funds to investment-grade credit, while hedge funds lean towards high-yield credit.

Because a company’s credit rating evolves with its financial performance, we aim to build a tool that lets users leverage financial data to predict whether a company will fall into the investment-grade or non-investment-grade bucket. With the machine learning model we built, investors can use forward-looking financial metrics to anticipate significant changes to a company’s credit rating.
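
As a rough illustration of the IG versus non-IG classification task, here is a minimal sketch in plain Python. It uses logistic regression as a stand-in, since this summary does not name the model that was actually used. The two metrics (`debt_to_ebitda`, `interest_coverage`) and every data point are synthetic assumptions chosen only to make the example run.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.05, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for one company under the current weights.
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, x))) - y
            for j, v in enumerate(x):
                grad_w[j] += error * v
            grad_b += error
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_ig(weights, bias, x):
    """Return the predicted probability that a company is investment grade."""
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Synthetic toy data: [debt_to_ebitda, interest_coverage]; 1 = IG, 0 = non-IG.
rows = [[1.0, 12.0], [1.5, 10.0], [2.0, 8.0], [5.0, 2.0], [6.0, 1.5], [7.0, 1.0]]
labels = [1, 1, 1, 0, 0, 0]
weights, bias = train_logistic(rows, labels)

low_leverage = predict_ig(weights, bias, [1.0, 12.0])
high_leverage = predict_ig(weights, bias, [7.0, 1.0])
```

Feeding projected rather than reported metrics into `predict_ig` is how a tool of this kind could flag a company drifting toward the other bucket.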


Low-Cost Through-the-Wall Human Detection and Localization

This project designs an ultra-wideband (UWB) radar and an integration tool that incorporates a range of components, including a transmitter, a receiver, and a UWB pulse generator. It uses millimeter-wavelength (mmWave) radiation, the part of the electromagnetic spectrum with frequencies typically in the range of 8 GHz to 20 GHz, which has specific properties when it comes to penetrating walls and other solid objects.

The device relies on the Doppler effect, which can still be observed with waves (including millimeter waves) that propagate through or around walls. The project also evaluates potential applications: beyond medical uses and life-saving operations, the device could be used by the military to see enemies behind walls. Its primary aim, however, is humanitarian: locating individuals trapped under debris in catastrophic events such as earthquakes.
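
To give a sense of scale for the Doppler effect mentioned above, the sketch below applies the standard two-way radar Doppler formula, f_d = 2 v f_c / c, at the two ends of the 8 GHz to 20 GHz range. The target speed of 0.5 m/s is an assumption for illustration only, and the function names are not taken from the project.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def wavelength_m(frequency_hz):
    """Free-space wavelength for a given frequency."""
    return SPEED_OF_LIGHT / frequency_hz


def doppler_shift_hz(radial_velocity_mps, carrier_hz):
    """Two-way Doppler shift seen by a radar for a radially moving target."""
    return 2.0 * radial_velocity_mps * carrier_hz / SPEED_OF_LIGHT


for carrier in (8e9, 20e9):
    shift = doppler_shift_hz(0.5, carrier)  # 0.5 m/s is an assumed target speed
    print(
        f"{carrier / 1e9:.0f} GHz: wavelength {wavelength_m(carrier) * 1e3:.1f} mm, "
        f"Doppler shift {shift:.1f} Hz"
    )
```

The shifts come out at a few tens of hertz, so detecting slow human motion this way requires resolving very small frequency differences against a carrier in the gigahertz range.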