Hello there! I'm
Cristobal


About me

  • Full Name: Cristobal Zamorano Astudillo
    • First Name: Cristobal
    • Last Name: Zamorano Astudillo
  • Type:
  • Education: Undergrad @ UC Berkeley
  • Hobbies: Running 🏃‍♂️ , NBA 🏀 , One Piece 🏴‍☠️ , drinking yerba mate (REAL ONE), collecting yerba mate gourds 🧉 , cooking (pandemic skill unlocked!)
  • Fav. Music: Check my Spotify
  • Languages: English, Español
  • Favorite Food: Cali-style Burrito 🌯, Chicago-style (Deep Dish) Pizza 🍕, Lasagna, Texan Brisket, and Completo, Churrasco, anything Italiano
  • Dislikes: Melted ice cream 🍨, violence

Projects

Full-Fledged Social Media Back End with FastAPI (ONGOING)

A back-end clone of the social media platforms that are built around voting on posts. Key features:

  • Get, create, delete user posts
  • Give "likes" to certain posts
  • User Authentication
  • Database storage of users and posts
  • Data migration

Technologies Used: Python, FastAPI, JWT, Postgres Database, SQLAlchemy ORM, Pydantic Data Models

View Code on Github


Sentence2Vec

Final paper of my research on using word embeddings and Machine Learning to improve student learning.

Code cannot be shared due to data confidentiality and sensitivity.

Technologies Used: Python, Keras, gensim, Pandas.



YouTube Video Data ETL using AWS

Using AWS Athena, Glue, Lambda, and Spark, I created a fully functional data catalog.

Built on YouTube video data from a Kaggle competition, the data catalog contains raw, queryable, and processed (transformed) data. The user can run SQL queries to retrieve data, move it to the processed layer, and display graphs and charts for visualization.

Technologies Used: Python, AWS

View Code on Github


Future Salary App

Scraped more than 20,000 data- and tech-related job postings from major cities of the United States to create a Salary Predictor using NLP and Machine Learning algorithms.

I used Beautiful Soup and Selenium to first retrieve the text data and then automate the process. The metadata contained features such as name, location, description, and others. Based on the exploratory analysis of my data, I created custom salary buckets to serve as the target of the prediction.
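As a rough illustration of the bucketing step (not the project's actual code), here is a minimal sketch that maps a salary to a bucket index. The bucket edges and the function names `salary_bucket` and `bucket_label` are hypothetical; the real edges came from the exploratory analysis and are not listed here.

```python
from bisect import bisect_right

# Hypothetical bucket edges in USD per year (assumption, not the real values).
BUCKET_EDGES = [60_000, 90_000, 120_000, 150_000]


def salary_bucket(salary: float) -> int:
    """Return the index of the bucket that contains `salary`.

    Bucket 0 is below the first edge; bucket len(BUCKET_EDGES) is at or
    above the last edge. A salary equal to an edge falls in the upper bucket.
    """
    return bisect_right(BUCKET_EDGES, salary)


def bucket_label(index: int) -> str:
    """Return a human-readable label for a bucket index."""
    if index == 0:
        return f"< {BUCKET_EDGES[0]:,}"
    if index == len(BUCKET_EDGES):
        return f">= {BUCKET_EDGES[-1]:,}"
    return f"{BUCKET_EDGES[index - 1]:,} - {BUCKET_EDGES[index]:,}"


print(bucket_label(salary_bucket(100_000)))  # 90,000 - 120,000
```

The bucket index can then be used as the label a model is trained to predict.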

Technologies Used: Python, Scikit-Learn, PyTorch, gensim, Heroku, Streamlit.

Link to App View Code on Github


Toxic Text Detector App

Using data from a previous Kaggle competition and advanced Machine Learning algorithms, I created a multilabel predictor for toxic text.

Technologies Used: Python, Seaborn, Streamlit, Flask, Heroku, NLTK, Scikit-Learn.

Link to App View Code on Github


Spotify Models
Spotify Project Part 3 - Classification Modeling and Comparison

I used Logistic Regression as the baseline model to compare against other Machine Learning classification models. This part of the project was an opportunity for me to learn about Machine Learning models that I didn't know before.

Technologies Used: Python, Scikit-Learn, XGBoost, LightGBM, Spotify's API (Spotipy).

View on NBViewer View Code on Github


Spotify Project Part 2 - EDA & PCA Construction

Continuing from Part 1, I explore the streaming history dataset and its Spotify features. This part of the project focuses on understanding the internal structure of the dataset.

I also extract other playlists, such as Discover Weekly and my Liked Songs, which are later used for Machine Learning modeling.

I used Seaborn and Plotly to explore the data through an in-depth EDA. I then discuss the idea behind the PCA algorithm for dimensionality reduction and construct the PCA algorithm from scratch.
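For readers unfamiliar with PCA, here is a minimal sketch of one common way to build it from scratch with NumPy, using an eigendecomposition of the covariance matrix. It is an illustration only and not necessarily how the notebook implements it; the function name `pca` and the toy data are assumptions, and it expects at least two features.

```python
import numpy as np


def pca(X: np.ndarray, n_components: int):
    """Project X (n_samples x n_features) onto its top principal components.

    Returns the projected data, the component vectors (as columns), and
    the fraction of total variance explained by each kept component.
    """
    # Center each feature at zero mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh handles symmetric matrices; eigenvalues come back in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Keep the directions with the largest variance.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    components = eigenvectors[:, order]
    explained_ratio = eigenvalues[order] / eigenvalues.sum()
    return X_centered @ components, components, explained_ratio


# Toy data lying exactly on the line y = 2x, so one component captures it all.
X_demo = np.array([[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]])
projected, components, ratio = pca(X_demo, n_components=1)
```

The sign of each component vector is arbitrary, so two correct implementations may return mirrored projections.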

Technologies Used: Python, Numpy, Seaborn, Plotly, Spotify's API (Spotipy).

View on NBViewer View Code on Github
Distribution of Danceability feature songs PCA Demo


Screenshot Spotify Song extractions 1 Screenshot Spotify Song extractions
Spotify Project Part 1 - Spotify Features & Data Extraction

First part of a three-part series using my own data from Spotify. I asked Spotify to send me all the data that they had collected from my user account.

Here I took the year-long streaming history I had requested and added to each song the Spotify features that the music platform uses for its own recommendation algorithms. I wrote a Python script to interact with Spotify's API and extract the important information.

Technologies Used: Python, Spotify's API (Spotipy).

View Code on Github


Santiago's Accident Prediction Classifier

This is the final project of the Graduate version of the class Data 100/200: Principles of Data Science at Berkeley.

I wanted to explore vehicle accident casualty data from my home city, Santiago, Chile, and build a classifier that predicts when and on which streets deadly accidents are more likely to happen. The goal is to help authorities make better use of their resources and create data-driven policies that help prevent loss of life.

Data was obtained from various places such as the Chilean Congress Library, OpenStreetMap, and official Chilean police data reports of accidents.

Technologies Used: Python, GeoPandas, Seaborn, Scikit-Learn

View on NBViewer View Code on Github
Santiago's Accidents Screenshot


Berkeley map Screenshot
BearMaps + Trie Data Structure

This is one of the assignments in another Berkeley course, CS 61B: Data Structures. I added an extra feature to the project that creates a Trie data structure from scratch.

Unfortunately, I can't share the code publicly because of the class's policy.
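Since the course code stays private, here is an independent, minimal sketch of a trie for readers who haven't seen one. It is written in Python rather than the project's Java, and the class and method names (`Trie`, `insert`, `contains`, `keys_with_prefix`) are illustrative assumptions, not the project's API. Prefix lookup is a typical use of a trie; whether the project used it that way is also an assumption.

```python
from typing import List, Optional


class TrieNode:
    def __init__(self) -> None:
        self.children = {}    # maps a character to the next TrieNode
        self.is_word = False  # True if an inserted word ends at this node


class Trie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """Add `word`, creating one node per missing character."""
        node = self.root
        for ch in word:
            if ch not in node.children:
                node.children[ch] = TrieNode()
            node = node.children[ch]
        node.is_word = True

    def _find(self, prefix: str) -> Optional[TrieNode]:
        """Return the node reached by following `prefix`, or None."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return None
        return node

    def contains(self, word: str) -> bool:
        """Return True only if `word` itself was inserted."""
        node = self._find(word)
        return node is not None and node.is_word

    def keys_with_prefix(self, prefix: str) -> List[str]:
        """Return all inserted words starting with `prefix`, sorted."""
        start = self._find(prefix)
        if start is None:
            return []
        results = []
        stack = [(start, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_word:
                results.append(text)
            for ch, child in node.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = Trie()
for name in ["bear", "bearmaps", "berkeley", "bay"]:
    trie.insert(name)
print(trie.keys_with_prefix("bea"))  # ['bear', 'bearmaps']
```

Each lookup costs time proportional to the length of the key, regardless of how many words are stored.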

Technologies Used: Java



This Website

I wrote this website from scratch. I always wanted to have my own website. I hope you like it! :)

Technologies Used: HTML, CSS, Javascript, Bootstrap

View Code on Github
Berkeley map Screenshot

Contact Me

My email is cristobal.zamorano at berkeley.edu

Check my Resume here
