THE OPEN SOURCE ANALYTICS CONFERENCE

Business Opportunity and Open Source Analytics

15 November 2022

OSA CON 2022 – The Open Source Analytics Conference

The Open Source Analytics Conference is back! We continue to connect developers of advanced open source analytic projects with app developers. Our theme for 2022 was using open source analytics to find business opportunities quickly and cost-efficiently. Missed out on OSA Con 2022? That’s okay. You can see it all on-demand below.

OSA CON 2022 Topic Categories:


Keynotes and Open Source

OSA Con 2022

Welcome to OSA Con 2022!

Join us as we guide you through the conference and highlight the many presenters who are contributing talks. We’ll also include a few tips about how to use the conference platform.
Presentation Slides
Robert Hodges, CEO

OSA Con 2022

The Open Source Analytic Universe, Version 2022

Every generation builds new cathedrals. For many of us, this means implementing analytic applications built on a foundation of open source.

We’ll survey developments in analytics since the last OSA Con and highlight new technologies that developers should be watching as we head into the mid-2020s.
Presentation Slides
Robert Hodges, CEO

OSA Con 2022

Panel Discussion: Open Source Analytics, Public Clouds, and VC Mega-Investment

See Doug Cutting, Maxime Beauchemin, and David Nalley in a panel discussion with Robert Hodges (CEO of Altinity) at OSACon 2022.
Doug Cutting, Maxime Beauchemin, David Nalley, Robert Hodges

OSA Con 2022

Navigating the Modern Data Stack with Open-source Headless BI

Artyom Keydunov, Cube Dev CEO and Co-founder, gave an in-depth tutorial at OSACon 2022 on using Cube’s headless BI layer to provide consistent data for all your applications (e.g., for visualization, dashboarding, embedded analytics) while also ensuring sub-second latency, centrally managing access rules, etc.
Artyom Keydunov, CEO & Co-Founder

ClickHouse

OSA Con 2022

Tips and Tricks to Keep Your Queries under 100ms with ClickHouse

Javi Santana, Co-Founder

OSA Con 2022

Using ClickHouse Database to Power an Analytics and Customer Engagement Platform

At OSACon 2022, Prafulla Gupta, Principal Architect at Times Internet, demonstrated how Times Internet empowered its Product Managers and Editors by developing an in-house product, GrowthRx, using ClickHouse Open Source Database to track and analyze user behavior to increase user retention and customer engagement.
Prafulla Gupta, Principal Architect

OSA Con 2022

Switching Jaeger Distributed Tracing to ClickHouse to Enable Advanced Performance Management (APM)

Our team switched our Jaeger (open source project used for distributed tracing) storage backend to ClickHouse (from Cassandra), which opened the door to a world of advanced analytics that we can run and provide our users. This talk will describe the journey from the switch, the learning curve, the challenges, and the eventual wins.
Presentation Slides
Maxime Beauchemin, CEO & Co-Founder

OSA Con 2022

ClickHouse: What is Behind the Fastest Columnar Database

Olena Kutsenko, Aiven’s Developer Advocate, demonstrated how to get the most out of ClickHouse and avoid pitfalls, explained OLAP and the architecture of columnar databases, and, lastly, gave the reasons behind ClickHouse’s most puzzling concepts.
Olena Kutsenko, Developer Advocate, Aiven

Analytics

OSA Con 2022

Building an Analytic Extension to MySQL with ClickHouse and Open Source

Vadim Tkachenko (Percona CTO) and Kanthi Subramanian (Altinity Developer Advocate) presented a talk at #OSACon 2022.
Vadim Tkachenko, CTO, Percona
Kanthi Subramanian, Developer Advocate, Altinity

OSA Con 2022

How Can HTAP Databases Simplify Your Data Infrastructure?

At #OSACon 2022, Liquan Pei, Senior Database Engineer at PingCAP, explored HTAP databases, their issues and challenges, and how TiDB, an open-source MySQL-compatible HTAP database, addresses them. Watch his talk!
Liquan Pei, Senior Database Engineer

OSA Con 2022 Panel Discussion

HTAP vs OLTP/OLAP

Peter Zaitsev (Founder of Percona), Ed Huang (CEO & Co-founder of PingCAP), Alexander Zaitsev (CTO of Altinity), and Robert Hodges (CEO of Altinity) have a #OSACon panel discussion on what HTAP is and how it works, when it’s better to add an analytic database, the questions developers should ask while designing new applications, and more.
Peter Zaitsev, Founder, Percona
Ed Huang, CEO & Co-founder, PingCAP
Alexander Zaitsev, CTO, Altinity
Robert Hodges, CEO, Altinity

OSA Con 2022

Quick Reflexes: Building Real-Time Data Analytics with Redpanda and ksqlDB

At #OSACon 2022, ChistaDATA’s ClickHouse Database Admin, Sri Sakthivel M.D., outlined how real-time data platforms deliver events immediately to applications and allow businesses to react quickly. The talk covers how to build a fast, hardware-efficient platform that is easy to manage with SQL, gives an intro to Redpanda and ksqlDB, and walks through a specific use case showing data travel from #MySQL to ksqlDB. Watch him in action!
Sri Sakthivel M.D., ClickHouse Database Admin, ChistaDATA

OSA Con 2022

Specifics of data analysis in Time Series Databases

Time series data is special, not only in its nature but also in the ways that we store and interact with it. In this talk, we’ll cover the differences between storing time series data in classic relational databases and in a new generation of time series databases like VictoriaMetrics and Prometheus.
Presentation Slides
Roman Khavronenko, Co-Founder

OSA Con 2022

Apache Iceberg: An Architectural Look Under the Covers

The data lakehouse is one of the most exciting trends in the data space, promising to merge the best aspects of data lakes and data warehouses without the problems of either. Open source tech is making this promise a reality, and in this talk Dremio Developer Advocate Alex Merced explores these technologies.
Presentation Slides
Alex Merced, Developer Advocate

OSA Con 2022

Presto for your Fast, Open SQL Lakehouse

Ahana’s Co-founder and CEO, Steven Mih, took the stage at #OSACon 2022 to talk about the latest innovations in the Linux Foundation’s Presto (e.g., multi-level caching) and how Presto is used for the open SQL lakehouse. Watch his talk to get a quick look at the history, use cases, and technical innovations.
Steven Mih, Co-Founder and CEO

Applications

OSA Con 2022

Building a Real-time Analytics Application with Apache Pulsar and Apache Pinot

Vadim Tkachenko, CTO, Percona
Kanthi Subramanian, Developer Advocate, Altinity

OSA Con 2022

Arrow in Flight: New Developments in Data Interoperability

David Li, Software Engineer at Voltron Data, gave an in-depth talk at #OSACon on #ApacheArrow, highlighting use cases where Arrow has accelerated analytics workflows by as much as 100x and showing where Arrow is going, with special attention to database connectivity.
David Li, Software Engineer, Voltron Data

OSA Con 2022

Scaling your Pandas Analytics with Modin

At #OSACon 2022, Ponder’s Co-founder and CEO, Doris Lee, explained the prevailing productivity challenges that data teams face when going from prototype to production and how Modin addresses them. She presented case studies highlighting how Modin empowers data teams to seamlessly accelerate their Pandas workflows. See her in action!
Doris Lee, Co-founder and CEO, Ponder

OSA Con 2022

Building Event Collection SDKs and Data Models

In this talk we’ll go through how we have designed and built over 20 different SDKs to collect events from all sorts of applications (from web & mobile to IoT to server-side), allowing users to collect a rich event stream of data. Then we’ll dive into, and demonstrate, the cross-warehouse downstream data models which aggregate the event stream into easy-to-consume data products for analytics, AI, composable CDP, recommendation engines, and many other use cases.
Presentation Slides
Paul Boocock, Head of Engineering, Snowplow

Knowledge Base

OSA Con 2022

What Data Engineering Can Learn from Frontend Engineering

Frontend engineering went through a revolution in the last decade. I’ll recap what happened, and how a similar revolution started in data engineering.
Presentation Slides
Pete Hunt, Head of Engineering, Elementl

OSA Con 2022

To Warehouse or to Lakehouse? And How Do Open Source Projects Compare?

OSACon 2022: Rachel Pedreschi, VP of Technical Services at Ahana, explained how you can decide whether a data warehouse or a data lakehouse is best for your use case and workloads. She discussed the data landscape, how to approach your data warehouse/data lakehouse strategy, and how some real-world customers are using Presto to bring analytics to the data lake.
Rachel Pedreschi, VP of Technical Services, Ahana

OSA Con 2022 Panel Discussion

Signal Correlation, the Ho11y Grail

Michael shows how signal correlation in observability use cases helps you spot issues faster, optimize code, and become more productive in delivering features.
Presentation Slides
Michael Hausenblas, Solution Engineering Lead, AWS

OSA Con 2022

Streaming Data Made Easy

Tim Spann & David Kjerrumgaard, Developer Advocates at StreamNative, showed how to click into new streaming applications the easy way with Apache Pulsar, ClickHouse, and Open Source. The #OSACon 2022 talk is a quick introduction on how to build modern data streaming applications.
Tim Spann and David Kjerrumgaard, Developer Advocates

OSA Con 2022

Extract, Transform, and Learn about Your Developers

Airbyte’s Senior Data/Analytics Engineer and Director of Connector Engineering, Alexandra Gronemeyer and Brian Leonard, recently presented a fascinating talk at #OSACon 2022. They extracted, loaded, transformed, and analyzed data to better learn about developer communities. Along the way, they also showed how to put open-source tools like Airbyte, ClickHouse, dbt, and Metabase into action to answer business questions.
Alexandra Gronemeyer, Senior Data/Analytics Engineer
Brian Leonard, Director of Connector Engineering

Thank You to Our Community Partners

We could not have pulled this together without your help promoting and locating speakers.
This is a community of communities and we are so happy to be a part of it all. Here’s to you all!

OSA-CON 2022 Sponsors

cube.dev

Open source headless BI: Consume data from any data source, organize it into consistent metrics, and use it with every app.
cube.dev

RudderStack

RudderStack is the warehouse-first, customer data platform (CDP) built for developers. The company takes a new approach to building and operating customer data infrastructure, making it easy to collect, unify, transform, and store customer data as well as securely route it to a wide range of common, popular tools.
rudderstack.com

Percona

Percona is a leading provider of unbiased open source database solutions that allow orgs to easily, securely and affordably maintain business agility, minimize risks, and stay competitive.
percona.com

PostHog

PostHog is an open-source product analytics platform that offers a suite of tools, including funnels, heat maps, session recording and more, all in a single platform.
posthog.com

VictoriaMetrics

VictoriaMetrics is a fast and scalable open source time series database and monitoring solution, with the company behind it focusing on helping individuals and organizations address their big data challenges through state-of-the-art monitoring and observability solutions.
victoriametrics.com

Aiven

Aiven provides managed open source data technologies, like PostgreSQL, Apache Kafka and OpenSearch, on all major clouds. Aiven enables customers to drive business results from open source that trigger true transformations far beyond their own backyard.
aiven.io

Meltano

Meltano enables everyone to realize the full potential of their data. We are building a modular, open source DataOps OS to be the foundation of every team’s ideal data stack. Meltano simplifies configuration, deployment and monitoring, and unites best-in-class open source tools and technologies for the data lifecycle. It allows data teams to benefit from DevOps best practices such as version control, code review and CI/CD.
meltano.com

Airbyte

Airbyte is the modern open-source ELT standard that replicates data from the long tail of APIs, databases & files to data warehouses, lakes and other destinations. Airbyte Cloud disrupts the ELT market with its transparent volume-based pricing and open-source extensibility.
airbyte.com

OpsVerse

OpsVerse is the creator of OpsVerse ONE, an IDP that unifies DevOps tools, micro services catalog and documentation. The company also offers fully-managed, open source, best-of-breed DevOps tools that can run anywhere in minutes. With OpsVerse’s private SaaS framework, anyone can achieve enhanced data residency, governance, and audit controls without spending additional engineering resources.
opsverse.io

Tinybird

Tinybird is a serverless analytical backend for developers. Build low-latency APIs in minutes with nothing but SQL.
tinybird.co

Ponder

Ponder builds enterprise-ready tools for rapid, flexible experimentation with data at scale. Operate on data at any scale, while continuing to use the familiar Pandas API. Powered by open-source Modin and Lux.
ponder.io

Times Internet

Times Internet is an Indian internet technology company, based in Gurgaon, which owns, operates and invests in various internet-led products, services and technology. It is the digital arm of the Times Group, the largest media conglomerate in India.
timesinternet.in

Ahana

Ahana, the Presto company, offers the only managed service for Presto on AWS with the vision to simplify open data lake analytics. Presto, the open source project created by Facebook and used at Uber, Twitter and thousands more, is the de facto standard for fast SQL processing on data lakes. 
ahana.io

PingCAP

PingCAP is an enterprise-grade software service provider committed to delivering an open-source, cloud-native, one-stop database solution for growth-oriented clients to focus on their business priorities.
pingcap.com

Dremio

Dremio is the lakehouse company. Hundreds of organizations, including 3 of the Fortune 5, use Dremio to deliver mission-critical BI on the data lake. As the original creator of Apache Arrow, Dremio is on a mission to reinvent SQL for data lakes and meet customers where they are in their cloud journey. Dremio was founded in 2015 and is headquartered in Santa Clara, CA.
dremio.com