Twitter Account Classification
Built with: Python (pandas, scikit-learn, NumPy)
- Preprocessed Twitter datasets, using stratified splitting to preserve the distribution of account types, and wrote the cleanTweet function to normalize tweet text before it is passed to the model.
- Classified accounts with a bag-of-words model and logistic regression, tuned the model's hyperparameters, and analyzed the keywords that best distinguish human from non-human accounts.
- Check it out on GitHub
Precision-Tolerant Database System (Capstone project)
Built with: MySQL
- Researching flexible database systems that accommodate imprecise data, challenging the traditional relational model in order to improve data retention and query accuracy while maintaining integrity.
- Developing a precision-tolerant database system that retains imprecise data violating Numerical Conditional Constraints, extending existing DBMSs to improve data retention and query handling, and exploring the resulting cost savings and analytics improvements.
- Check it out on GitHub
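One way to picture the idea of retaining rows that violate a numerical constraint, instead of rejecting them, is the sketch below. It is only an illustration under stated assumptions: the `readings` table, the humidity rule, and the `violates_constraint` flag are all hypothetical, the rule is just one guess at what a Numerical Conditional Constraint might look like, and Python's built-in `sqlite3` stands in for MySQL so the example runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE readings (
        id INTEGER PRIMARY KEY,
        sensor_type TEXT NOT NULL,
        value REAL NOT NULL,
        violates_constraint INTEGER NOT NULL DEFAULT 0
    )
    """
)


def violates(sensor_type: str, value: float) -> bool:
    """Hypothetical rule: IF sensor_type = 'humidity' THEN 0 <= value <= 100."""
    return sensor_type == "humidity" and not (0.0 <= value <= 100.0)


def insert_reading(conn: sqlite3.Connection, sensor_type: str, value: float) -> None:
    # A strict CHECK constraint would reject the row outright; here the row
    # is kept and the violation is recorded in a flag column instead.
    conn.execute(
        "INSERT INTO readings (sensor_type, value, violates_constraint) "
        "VALUES (?, ?, ?)",
        (sensor_type, value, int(violates(sensor_type, value))),
    )


for sensor_type, value in [
    ("humidity", 45.2),
    ("humidity", 100.4),     # slightly out of range, retained and flagged
    ("temperature", 130.0),  # the rule does not apply to this sensor type
]:
    insert_reading(conn, sensor_type, value)

# Queries can then choose between a strict view and a tolerant view.
strict_count = conn.execute(
    "SELECT COUNT(*) FROM readings WHERE violates_constraint = 0"
).fetchone()[0]
tolerant_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
print(strict_count, tolerant_count)
```

The point of the flag column is that the decision to exclude imprecise rows moves from insert time to query time, so no data is lost.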
Ecommerce Data Insights
Built with: dbt, Snowflake, Apache Airflow, SQL
- Hands-on project integrating dbt (data build tool) with Snowflake, focused on data transformation and deployment techniques that make the data pipeline more efficient and reliable.
- Key steps: environment setup, dbt configuration, model creation and transformation, macros and tests, and Airflow-based deployment of the models.
- Check it out on GitHub
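The order of the dbt steps that a scheduler such as Airflow would trigger can be sketched in plain Python. This is an assumption-laden illustration rather than the project's DAG: the project directory `ecommerce` and target name `dev` are hypothetical, and the helper only assembles standard dbt CLI commands (`dbt deps`, `dbt run`, `dbt test`) that an Airflow task, for example a `BashOperator`, could execute in the same order.

```python
import shlex
import subprocess
from typing import List


def build_dbt_commands(project_dir: str, target: str) -> List[List[str]]:
    """Return the dbt CLI calls in the order a deployment would run them."""
    return [
        ["dbt", "deps", "--project-dir", project_dir],  # install packages
        ["dbt", "run", "--project-dir", project_dir, "--target", target],   # build models
        ["dbt", "test", "--project-dir", project_dir, "--target", target],  # run tests
    ]


def run_pipeline(commands: List[List[str]], dry_run: bool = True) -> List[str]:
    """Print each command; execute it only when dry_run is False."""
    rendered = []
    for command in commands:
        line = shlex.join(command)
        rendered.append(line)
        print(line)
        if not dry_run:
            # check=True stops the pipeline at the first failing step,
            # so tests never run against models that failed to build.
            subprocess.run(command, check=True)
    return rendered


# Hypothetical project directory and target name.
commands = build_dbt_commands("ecommerce", "dev")
rendered = run_pipeline(commands)  # dry run: prints the commands only
```

Running the tests after the models are built mirrors the usual dbt workflow, where a failing test should block anything downstream from consuming the new tables.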