Machine learning, software engineering, artificial intelligence

skrub like a pro: clean, prepare, and transform your data faster

Deeptech module (new)
skrub is a Python package that bridges dataframe libraries such as pandas and machine-learning libraries such as scikit-learn. This course introduces the library's main features and shows how skrub simplifies the path from raw data to machine-learning models.

Session:

No sessions are currently available.

Contact us!

Objectives

  • Understand the main features of skrub and how it fits into a typical machine-learning pipeline.
  • Learn the use cases for skrub transformers, and how they can simplify certain data preparation tasks.
  • Apply and combine the objects provided by the skrub library to various scenarios.

Prerequisites

  • Basic Python programming.
  • Basic Pandas and scikit-learn knowledge.
  • Experience with Jupyter notebooks is helpful, but not required.
  • The scikit-learn beginner course is a prerequisite for participants who do not already have basic knowledge of scikit-learn.

Program

This introductory course covers the main features of skrub by building a full machine-learning pipeline, from data exploration to model training. It describes use cases where the skrub API reduces the user's workload by simplifying common data preparation operations. The course includes a high-level overview of the library with practical explanations and examples, and dedicated time slots for exercises that put the concepts into practice.

SKILLS YOU’LL GAIN:
  • Explore and diagnose tabular data with TableReport.
  • Clean and engineer features using skrub transformers.
  • Build end-to-end scikit-learn pipelines with skrub and compare performance to baseline setups.

Instructor

  • Riccardo Cappuzzo

    Riccardo Cappuzzo holds a dual master’s degree: one in Computer Systems Security (Telecom Paris, 2018) and another in Communications and Computer Networks Engineering (Politecnico di Torino). He earned his PhD in Computer Science from Sorbonne Université, where his research focused on automated methods for cleaning tabular data.
    Currently, he serves as the lead developer of the skrub Python library and is a member of the SODA Team at Inria. His work involves developing new features for the library and promoting its adoption through public outreach.

Target audience

IT developers, engineers, data scientists and data analysts.

Practical information

  • Duration: half a day (3 hours)
  • Schedule: 2pm – 5pm
  • Registration deadline: registrations close 15 days before the scheduled date.

  • Admission requirements: admission to the course is subject to prior selection. Applicants must meet the prerequisite requirements listed above.

  • Teaching format: the training is delivered online, in English, with course materials in English.

  • Group size: maximum of 12 participants.

  • In-company sessions: private sessions can be organized for groups of 5 or more participants. Please contact us using the registration form.

  • Training materials: course materials will be provided to participants.

  • Assessment and completion: assessment is conducted through a short quiz. A certificate of completion is issued at the end of the training.

  • Accessibility – disability: Inria is committed to ensuring accessibility to its training programs, both online and on-site, for people with disabilities.

Pricing information

  • Price: €500 per participant

  • Discounted rates: available for groups of 5 or more participants (10% discount for 5 to 9 participants, 20% discount for 10 or more participants)

  • Member discount: companies that are members of the Aktantis cluster receive a 20% discount

  • Funding: self-funded (company or individual funds)

Program P16

This training was developed with the support of Program P16 (led by Inria), which aims to strengthen the digital sovereignty of France and Europe in the field of AI by developing open, interoperable software libraries covering the full data cycle.