portfolio

My Portfolio - Prince Nwaekwu

GitHub profile: https://github.com/dthatprince

Welcome to my portfolio! This repository showcases my projects and a few fun experiments.

About Me

I am Prince Nwaekwu, a Python Backend Engineer with a strong background in Data Science and Machine Learning. I specialize in building scalable backend systems using Flask, FastAPI, and Django, and creating interactive frontends with Vue.js. My expertise includes developing RESTful APIs, integrating databases (MySQL, PostgreSQL, MongoDB), and deploying cloud-based solutions on AWS and Azure.

With a foundation in data-driven problem solving, I bring analytical depth to backend development, optimizing performance, automating workflows, and ensuring clean, maintainable code.

I’m passionate about creating intelligent, full-stack solutions that bridge AI-powered insights with robust software engineering and modern frontend experiences.

I have a proven track record in deploying machine learning models, enhancing data workflows with MLOps, and building robust APIs. My technical skills are complemented by my ability to communicate effectively with both technical and non-technical stakeholders, making complex concepts accessible and actionable.

I am committed to continuous learning and applying cutting-edge technologies to solve real-world problems. I am keen to contribute to projects that leverage Python to drive innovation and deliver value.

Let’s explore the possibilities of data and Python together!

Thank you for taking the time to learn more about me and my work!

Email: nwaekwuprince@gmail.com
LinkedIn: Profile
Tableau: Dashboards

Projects

HR Streamline AI Assistant & API

A Human Resource Management System (HRMS) that automates employee data management, leave tracking, and core HR processes. It features an AI assistant that answers natural-language questions about the database, so analysis does not require writing SQL queries by hand. Repository: GitHub Link
Live Demo
AI Assistant for reporting: https://aihrstreamline-dthatprince.streamlit.app/
API & Documentation: https://dthatprince-hrstreamline-api.onrender.com/

FinChain Advisor App

This application lets users query financial data in natural language. It demonstrates how to integrate a ChromaDB vector store within a LangChain pipeline to build a GPT-powered investment advisor Q&A agent for querying and analyzing financial data. Repository: GitHub Link

Data anonymization for LLMs with OpenAI, LangChain and Microsoft Presidio

Developing and exploring use cases for reversible anonymization, multi-language anonymization, and question answering with privacy protection. Repository: GitHub Link
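
To illustrate what "reversible" means here, the sketch below masks email addresses with numbered placeholders and keeps a mapping so the original text can be restored after an LLM call. It is a simplified stand-in built on the standard library, not Microsoft Presidio or the repository's code; the function names, the placeholder format, and the email pattern are assumptions made for illustration.

```python
import re

# Simplified pattern for illustration; it does not cover every valid email address.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def anonymize(text):
    """Replace each distinct email with a numbered placeholder.

    Returns the masked text and a mapping of placeholder -> original value.
    """
    mapping = {}

    def substitute(match):
        original = match.group(0)
        # Reuse the placeholder if this email was already seen.
        for placeholder, value in mapping.items():
            if value == original:
                return placeholder
        placeholder = f"<EMAIL_{len(mapping) + 1}>"
        mapping[placeholder] = original
        return placeholder

    return EMAIL_PATTERN.sub(substitute, text), mapping


def deanonymize(text, mapping):
    """Restore the original values using the placeholder mapping."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text
```

The masked text is what would be sent to the model; calling deanonymize on the model's answer with the same mapping puts the real values back.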

Creating GCP Infrastructure with Terraform

Description: This project demonstrates how to create Google Cloud Platform (GCP) infrastructure with Terraform. It sets up a Google Cloud Storage bucket and a BigQuery dataset in the europe-west1 region, using a service account for authentication. Repository: GitHub Link

Image Object Detection using ImageAI

Description: An image object detection project using ImageAI with the RetinaNet, YOLOv3, or TinyYOLOv3 models.
Repository: GitHub

Anomaly Detection in Time Series

This project explores different methods for anomaly detection in time series data, focusing on CPU utilization data from an AWS EC2 instance. The following techniques are implemented and compared: Mean Absolute Deviation (MAD), Isolation Forest, and Local Outlier Factor (LOF).

Repository: GitHub
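
As a rough illustration of the first technique, the sketch below scores each point by its distance from the mean, measured in units of the mean absolute deviation, and flags points above a threshold. It is a minimal standard-library sketch, not the repository's implementation; the function name and the default threshold of 3.0 are assumptions.

```python
from statistics import fmean


def mad_anomalies(values, threshold=3.0):
    """Return the indices of points whose deviation from the mean
    exceeds `threshold` times the mean absolute deviation (MAD)."""
    center = fmean(values)
    deviations = [abs(v - center) for v in values]
    mad = fmean(deviations)
    if mad == 0:
        # All values are identical, so nothing can be flagged.
        return []
    return [i for i, d in enumerate(deviations) if d / mad > threshold]
```

For example, mad_anomalies([40, 42, 41, 39, 43, 40, 95, 41, 42, 40]) returns [6], the index of the 95% spike in this made-up CPU series. Because deviations above and below the mean balance out, a score can never exceed half the number of points, so very short series need a lower threshold.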

PySpark ML Pipeline for Bank Term Deposit Subscription Prediction

Description: A PySpark project implementing a machine learning pipeline for predicting term deposit subscriptions using the Bank Term Deposit Subscription Dataset. The task is to classify, from the available features, whether a customer subscribes to the term deposit after receiving a call from a bank representative.
Repository: GitHub

Docker-Based Data Ingestion Pipeline

A Docker-based data ingestion pipeline that automates the extraction, transformation, and loading (ETL) of data into a PostgreSQL database. It leverages Docker to containerize and manage the deployment of a Python script, a PostgreSQL database, and PgAdmin, ensuring seamless interoperability and isolation in a single Docker network.
Technologies: Docker, Python, PostgreSQL, PgAdmin
Repository: GitHub

Multivariate Anomaly Detection Using the UCI Thyroid Disease Data Set

This project demonstrates an end-to-end approach to multivariate anomaly detection using Python within a Jupyter Notebook. It walks through loading and preprocessing the UCI Thyroid Disease dataset, then training and evaluating models on it. Repository: GitHub Link

Univariate Anomaly Detection (Machine Learning Methods)

This project implements univariate anomaly detection using machine learning techniques in Python, within a Jupyter Notebook environment. It focuses on detecting unusual patterns or outliers in single-variable datasets. Repository: GitHub Link

Euro 2024 Parliament Elections: LSTM Text Prediction Model

Description: This project utilizes an LSTM (Long Short-Term Memory) model to perform text completion and prediction. The training data is sourced from the Wikipedia page about the 2024 European Parliament elections. The model is designed to predict the next word in a given sequence of text, demonstrating the capabilities of LSTM networks in handling sequential data.
Repository: GitHub
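
A next-word model of this kind is typically trained on pairs of (preceding words, next word). The sketch below shows only that data-preparation step in plain Python; it does not build or train an LSTM and is not taken from the repository. The function name, the whitespace tokenization, and the default context size are assumptions.

```python
def make_next_word_pairs(text, context_size=3):
    """Split text into (context, next_word) training pairs.

    Each context is a list of `context_size` consecutive words and the
    target is the word that immediately follows them.
    """
    tokens = text.lower().split()
    pairs = []
    for i in range(context_size, len(tokens)):
        context = tokens[i - context_size:i]
        pairs.append((context, tokens[i]))
    return pairs
```

In a full pipeline the words would then be mapped to integer IDs before being fed to the network.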

GPT-2 Text Generator App

Description: This project is a text generator application built using GPT-2, Gradio, and TensorFlow. The app takes a user-provided sentence as input and generates a complete paragraph based on that sentence.
Repository: GitHub

AI Blog Post Generator using LLAMA2

Description: An AI-powered blog post generator built with Streamlit. It uses the LLaMA-2 model to generate blog posts based on user inputs.

Features:

Generate blog posts tailored to different job profiles (Researchers, Data Scientists, AI Enthusiasts).
Customize the length of the blog post.
Easy-to-use web interface built with Streamlit.
Repository: GitHub

Data Analysis and Personalized Offers for an Effective Marketing Strategy (Completing Soon: August 2024)

Description: Leveraging Personalized Offers and Data Analysis for an Effective Marketing Strategy.
Repository: GitHub

Pricing under Uncertainty - Implied Volatility Model - Risk-Free Interest Rates and Back-Testing (Completing Soon: August 2024)

Repository: Link currently unavailable.

Credit Risk Modeling and Credit Scoring For Financial Lending

Description: This project builds credit scoring models tailored to financial lending in the fintech industry. It focuses on credit risk modeling and on understanding the role of data science in the lending industry.
Repository: GitHub

Forecasting Views for a Medium article using FB Prophet - Time Series

Description: Forecasting Views for a Medium article using FB Prophet.
Repository: GitHub

Analyzing Patient’s Medical Appointments

Description: Exploring Factors Associated with No-Show Appointments in Medical Settings
Repository: GitHub

Machine Learning Models Analysis using Yellowbrick Visualizers and API

Description: My API documentation for the Yellowbrick visualizers and API. It contains various production-ready visualizers along with code examples showing how to use them.
Repository: GitHub

Analyzing Streaming Service Content with SQL

May 2022. An exploration of streaming service content using SQL, focusing on major players in the streaming industry: Netflix, Amazon, Hulu, and Disney+.

Project link: DataCamp

Player Retention A/B Testing for Mobile Game

Description: The goal of this project is to conduct A/B testing and player retention analysis on the Cookie Cats game dataset to investigate how certain factors affect player retention. Player retention is a crucial metric in the gaming industry, as it directly correlates with the success and growth of a game. By understanding the factors that influence retention, game developers can make informed decisions to optimize player engagement and enhance the overall gaming experience.
Repository: GitHub
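
One common way to compare retention rates between two A/B groups is a two-proportion z-test. The sketch below implements it with the standard library only; it is an illustration of that general technique, and the repository may use a different method such as bootstrapping. The function and parameter names are assumptions.

```python
from math import erf, sqrt


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for the difference between two retention rates.

    Returns the z statistic and the p-value. Assumes both groups are
    non-empty and the pooled rate is strictly between 0 and 1.
    """
    rate_a = retained_a / n_a
    rate_b = retained_b / n_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / std_error
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value
```

Calling two_proportion_z_test(200, 1000, 150, 1000) with these made-up counts gives a z statistic of about 2.94, which would indicate a difference unlikely to be due to chance at the usual 5% level.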

Text Mining - Text Analysis, Classification, and Clustering (R & Python)

Description: This project focuses on applying text analysis, classification, and clustering methods to a real-world corpus using R and Python. The goal is to clean the data, perform statistical analysis, create a classification model, and apply a clustering method.
Repository: GitHub

Yelp Business Reviews Sentiment Analysis

Description: Understanding how your customers or users feel about your business.

The project uses BeautifulSoup and Python to scrape business reviews from the web, cleans and lemmatizes the review text to extract meaningful insights, and calculates sentiment with TextBlob's sentiment property.

Repository: GitHub

Global Terrorism Data Analysis and Prediction, Model Deployment using Streamlit

Model Deployment: GitHub
Repository: GitHub

Recommendation System with Python, Machine Learning and AI

Description: This repository contains the implementation of various recommendation systems using Python and machine learning techniques. The goal of these systems is to provide personalized recommendations to users based on their preferences and historical data.

The implemented recommendation systems include:

Classification-based Collaborative Filtering Systems - Logistic Regression
Content-Based Recommenders - Nearest Neighbors Algorithm
Correlation-Based Recommenders
Model-based Collaborative Filtering Systems with SVD Matrix Factorization
Popularity-Based Recommenders
Repository: GitHub
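
The simplest approach in the list above, a popularity-based recommender, can be sketched in a few lines: count how many ratings each item received and suggest the most-rated items to everyone. This is a minimal standard-library illustration, not the repository's code; the function name and the (user, item, rating) tuple layout are assumptions.

```python
from collections import Counter


def popularity_recommendations(ratings, top_n=3):
    """Return the `top_n` items with the most ratings.

    `ratings` is an iterable of (user_id, item_id, rating) tuples. The
    rating values are ignored here; only the number of ratings counts.
    """
    counts = Counter(item_id for _, item_id, _ in ratings)
    return [item_id for item_id, _ in counts.most_common(top_n)]
```

Unlike the other approaches listed, this one is not personalized: every user receives the same suggestions.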

Automation with Python

Using Python for Automating:

Reading and writing files
Organizing directories
Web scraping with Beautiful Soup
Automating web browsing with Selenium
Automating with APIs
Creating API requests
Linking API calls
Repository: GitHub
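
As an example of the "organizing directories" task above, the sketch below moves each file in a folder into a subfolder named after its extension. It is a minimal illustration using pathlib and shutil, not the repository's script; the function name and the "no_extension" folder name are assumptions, and name collisions are not handled.

```python
import shutil
from pathlib import Path


def organize_by_extension(directory):
    """Move every file in `directory` into a subfolder named after its extension.

    Returns a dict mapping each moved file name to its subfolder name.
    """
    directory = Path(directory)
    moved = {}
    # Snapshot the listing first, because the loop creates new subfolders.
    for path in list(directory.iterdir()):
        if not path.is_file():
            continue
        folder_name = path.suffix.lstrip(".").lower() or "no_extension"
        target_dir = directory / folder_name
        target_dir.mkdir(exist_ok=True)
        shutil.move(str(path), str(target_dir / path.name))
        moved[path.name] = folder_name
    return moved
```

Since the function moves files, it is worth trying on a scratch folder first.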

Human Resource Data Analysis & Churn Prediction

Description: An analysis of employee data in the dataset “HR_comma_sep.csv” to find out what contributes to employees leaving the company.
Repository: GitHub

Data Analysis of Uber Trips

Repository: GitHub

University Degree - Gender Gap Analysis

Description: Analysis of the Gender Gap in University Degrees (STEM)
Repository: GitHub

House Price Prediction Using TensorFlow - Deep Learning

Repository: GitHub

Customer Segmentation using KMeans - Python

Repository: GitHub

NLP Project - Chatbot Using NLTK

Description: Built a chatbot that answers questions about global warming (Python & NLTK).
Repository: GitHub

Netflix Movies & TV Shows Analysis

Description: Performing exploratory data analysis (EDA) to check whether the average duration of movies has been declining.
Repository: My DataCamp WorkSpace

Google Play Store App and Reviews - Data Analysis and Visualization

Description: Analysis and visualization of Google Play Store data. The Play Store apps data has the potential to drive app-making businesses to success: it yields actionable insights that developers can use to capture the Android market.
Repository: GitHub
Repository: My DataCamp WorkSpace

Analyzing the Decline of Movie Durations on Netflix (2011-2020)

Description: In this project, I aimed to explore the trend of declining movie durations on Netflix from 2011 to 2020. Using a CSV file containing Netflix data and Python’s pandas library, I created a DataFrame and analyzed the average movie durations over the years. I also visualized the data using Matplotlib and Seaborn libraries to better understand the trend and its possible causes. By the end of the project, I had gained insights into the changing landscape of movie durations on Netflix and developed my data manipulation and visualization skills using Python.
Repository: My DataCamp WorkSpace

Analyzing the Scala Programming Language Repository

I analyzed the development history of the Scala programming language by reading, cleaning, and visualizing data from its real-world project repository.

The dataset included information about pull requests and the files that were modified by each request, which were previously mined and extracted from GitHub. I used Python to read, clean, and visualize the data to identify the individuals who had the most influence on Scala’s development and determine the language experts.

I also explored trends in the development of Scala over time to gain insights into its evolution as an open-source project. By the end of the project, I had developed my data manipulation, visualization, and analysis skills using Python and gained a better understanding of the development of the Scala programming language.

Repository: My DataCamp WorkSpace

Contact Me

Feel free to reach out to me for collaboration or any inquiries:

Email: nwaekwuprince@gmail.com
LinkedIn: Profile

Thank you for visiting my portfolio!