Hi, I am Kelechi Ikegwu.

I am a Data Scientist at Volvo Group and a recent Ph.D. graduate from the University of Illinois with 9 years of experience in machine learning, data visualization, and applied research. My dissertation is titled “Machine Learning and Information-Theoretic approaches for Financial Applications”. My professional interests lie in data science, machine learning, optimization, programming, data visualization, art, and game development.


In the Projects tab, you'll find a variety of projects and tools related to one or more of my interests and research areas.


Finally, in the Art tab, you'll find photos, oil paintings, and 3D-printed sculptures that I have worked on.

Languages

Python

95%

SQL

95%

PySpark

95%

HTML5

80%

R

80%

JavaScript

80%

C#

80%

Selected Skills/Tools

Machine Learning

95%

Data Visualization

95%

Linux

95%

Git

90%

Databricks

90%

MLflow

85%

Game Design

85%

Data Scientist

Volvo Group

2022 — Present

Worked with stakeholders throughout the organization to identify opportunities for leveraging company data to drive business solutions. Implemented theoretical concepts and communicated them to stakeholders and other data scientists to improve the performance of existing data science initiatives. Leveraged Spark on Databricks to derive insights from billions of observations to increase the efficiency of building trucks in factories. Developed an end-to-end recommendation system to automate the competitive pricing of more than 13,000 aftermarket truck parts. Coordinated with different functional teams to improve the performance of machine learning models and the scalability of code.

Lecturer

University of North Carolina at Greensboro

2022 — Present

Taught graduate and undergraduate computer science students a problem-based learning introduction to Data Science, including programming with data; version control; data mining, munging, and wrangling; statistics, analytics, and visualization; and applied machine learning, directed towards scientific, social, and environmental challenges. [View Course Repo]

Research Fellow, Laboratory of Computation, Data, and Machine Learning

University of Illinois at Urbana-Champaign

2016 — 2021

As a researcher, I developed Standard Machine Learning Language (SML), which attempts to automate parts of machine learning using SQL-like queries. I used machine learning to predict the profitability of firms [View Paper]. I developed an open-source implementation to estimate Transfer Entropy which is up to 1072 times faster than existing implementations [View Paper] [View Github Repo]. I also used information theory measures along with network analysis to identify characteristics of information flowing between financial entities.

PhD Candidate, University of Illinois at Urbana-Champaign

Dissertation Title - “Machine Learning and Information-Theoretic approaches for Financial Applications”.

2016 — 2021

I obtained the Illinois Distinguished Fellowship and GEM Associate Fellowship. As part of my studies, I took 32 credits' worth of graduate-level courses in data science related areas for the equivalent of my master's degree. I then took an additional 32 credits' worth of graduate-level courses for my PhD coursework. I am currently completing 32 credits' worth of dissertation research. Relevant coursework is at the bottom of this webpage. I helped develop, lecture, and teach the Foundation of Data Analytics, Foundations of Data Science, and Advanced Data Science courses. Lastly, my research lies in the areas of Machine Learning, Finance, and Information Theory.

Intern, Air Force Research Laboratory

Summer 2016

As an intern, I carried out experiments with Caffe to determine the effect of using synthetic data when classifying objects in images. As evidenced by the Large Scale Visual Recognition Challenge (ILSVRC), Deep Learning (DL) techniques have become the de facto standard for labeling objects in images, exceeding human-level performance; however, large training datasets are required to achieve this performance. This makes it challenging to apply DL to small datasets, and applying DL to sparse datasets will most likely require a reliance on synthetic training data.

Intern, NASA Jet Propulsion Laboratory, Pasadena CA

Summer 2015

As an intern, I investigated circumstellar matter in young and dying stars, which can potentially aid the NASA Origins program in understanding the lifecycle of stars. I created an application programming interface (API) and pipeline to extract far and near ultraviolet (FUV and NUV) properties from Asymptotic Giant Branch (AGB) stars, computed summary and variability statistics, and developed a plotting program to view parameters from extracted AGB stars. High FUV/NUV emission and variability in AGB stars point to signatures of accretion-related phenomena around the companions of an AGB star. FUV variability is a strong indicator of X-ray emission. Thus, our study enables us to generate a candidate list of AGB stars for an X-ray study.

Intern, NASA Ames Research Center, Silicon Valley

Summer 2014

As an intern, I assessed a biologically inspired machine learning algorithm called NuPIC using data from a new "Green" building known as "Sustainability Base" at NASA Ames Research Center. I employed this algorithm and other statistical methods to detect adverse events in Sustainability Base’s data.

Student, North Carolina A&T State University

BS in Information Technology, Minor in Applied Mathematics

2012 — 2016

I obtained the NASA MUREP Scholarship and took a variety of Mathematics, Science, Information Technology, and Computer Science courses.


If you would like a more comprehensive summary of my previous work experience, you can view my Resume.


Relevant Course Work

[INFO 490 RB] Foundations of Data Science

A

[INFO 490 IT] Information Trust

A

[INFO 490 RB2] Advanced Data Science

A

[CS 598RK] Data Driven Design

A

[IE 522] Statistical Methods in Finance

B

[INFO 403] Game Design: Virtual Worlds

A

[IS 590MD] Methods for Data Science

A

[IS 588] Research Design in Information Science

A

[IS 542] Data, Statistical Models, and Information

B

[CS 491 TC] Tradecraft for Coders

Pass

[ITT 420] Introduction to Unix/Linux

A

[ITT 430] Linux Systems Administration

A

[ITT 325] Intro to Computer Database Management

A

[CST 325] Computer Database Management II

A

[MIS 352] Object-Oriented Programming

A

[COMP 280] Data Structures

A

[COMP 285] Analysis of Algorithms

A

[MATH 224] Intro Probability & Statistics

A

[MATH 450] Linear Algebra & Matrix Theory

A

[MATH 608] Methods of Applied Statistics

A
