Toni Cebrián, Developer in Barcelona, Spain

Toni Cebrián

Verified Expert in Engineering

Machine Learning Developer

Location
Barcelona, Spain
Toptal Member Since
February 4, 2019

A rare mixture of data scientist and data engineer, Toni is able to lead projects from conception and prototyping to deployment at scale in the cloud.

Portfolio

D5.ai
Google Cloud, NEO, Crypto, Python, Scala, Data Science, Recommendation Systems...
Coinfi
PubSubJS, Data Flows, Apache Beam, Python, Apache Airflow, Data Science...
Stuart
Akka, Redshift, Apache Kafka, Apache Airflow, Scala, Python, Data Science...

Experience

Availability

Part-time

Preferred Environment

Linux

The most amazing...

...experience has been giving a talk on typeclasses in Scala at a local Scala meetup group.

Work Experience

Founder

2019 - PRESENT
D5.ai
  • Ingested the Bitcoin graph into a Neo4j database, using Airflow to periodically crawl BigQuery tables of Bitcoin transactions.
  • Created asyncio web crawlers in Python to scrape websites with newsworthy content.
  • Maintained and evolved an SDK in Scala and Haskell so that customers using those languages can access the web APIs.
Technologies: Google Cloud, NEO, Crypto, Python, Scala, Data Science, Recommendation Systems, Data Engineering

Lead Data Engineer

2018 - 2018
Coinfi
  • Created the ETL orchestration systems using Airflow with Composer in Google Cloud.
  • Created scraping services to collect crypto data (prices, events, news) for ingestion into the platform.
Technologies: PubSubJS, Data Flows, Apache Beam, Python, Apache Airflow, Data Science, Recommendation Systems, Data Engineering, Web Scraping

Head of Data Science

2016 - 2018
Stuart
  • Designed the company's data warehouse using Redshift.
  • Created a forecasting model for predicting driver logins to the platform and the deliveries to be served.
  • Architected an event sourcing system for complex event processing.
  • Deployed a route optimization algorithm for picking drivers based on route and package size.
  • Created the data science team from scratch.
Technologies: Akka, Redshift, Apache Kafka, Apache Airflow, Scala, Python, Data Science, Machine Learning, Data Engineering, Artificial Intelligence (AI), Natural Language Processing (NLP)

Chief Data Officer

2014 - 2016
Enerbyte
  • Architected the infrastructure for ingesting data from IoT devices.
  • Researched algorithms for energy disaggregation from a single point of measure.
  • Created the data science team from scratch.
Technologies: Apache Kafka, Spark Streaming, Spark, Scala, Python, Data Science, Machine Learning, Data Engineering, Artificial Intelligence (AI), Natural Language Processing (NLP)

Head of Data Science

2012 - 2014
Softonic
  • Created a recommender system based on textual content from app reviews.
  • Developed an improved search engine using machine learning and Solr.
  • Created the data science team from scratch. Hired all relevant profiles and set up the OKRs and managerial tasks.
Technologies: Semantic Web, RDF, Word2Vec, Solr, Recommendation Systems, Spark, Hadoop, Scala, Python, Data Science, Machine Learning, Data Engineering, Artificial Intelligence (AI), Natural Language Processing (NLP)

Typeclasses Talk

https://github.com/tonicebrian/typeclasses-talk
I gave a talk on typeclasses in Scala at my local Scala meetup.

SGF Parser in Haskell

https://github.com/tonicebrian/sgf
I'm the current maintainer of the SGF library in Haskell.

Languages

Python, Python 3, Scala, SQL, RDF, Haskell, C++, Java

Frameworks

Spark, Akka, Hadoop

Libraries/APIs

Spark Streaming, Pandas, Scikit-learn, NumPy, PubSubJS, Python Asyncio, TensorFlow, XGBoost

Tools

Apache Airflow, Cloud Dataflow, Apache Beam, Solr, Apache Avro

Paradigms

Functional Programming, Data Science, Reactive Programming

Other

Machine Learning, Akka HTTP, Data Mining, Data Engineering, Artificial Intelligence (AI), Crypto, NEO, Data Flows, Recommendation Systems, Word2Vec, Semantic Web, Web Scraping, Natural Language Processing (NLP), Deep Learning, Financial Modeling, Monte Carlo Simulations

Platforms

Apache Kafka, Linux

Storage

Redshift, Cassandra, Google Cloud, Redis

2009 - 2012

Master's Degree in Artificial Intelligence

Universitat Politecnica de Catalunya - Barcelona, Spain

2009 - 2011

Postgraduate Degree in Quantitative Techniques for Financial Products

Universitat Politecnica de Catalunya - Barcelona, Spain

MAY 2012 - PRESENT

Cloudera Certified Hadoop Professional

Cloudera