THE OPEN SOURCE ANALYTICS CONFERENCE
Business Opportunity and Open Source Analytics
15 November 2022 (Ended)
All sessions have been recorded and published as on-demand videos categorized by topic. Check below!
OSA CON 2022 – The Open Source Analytics Conference
The Open Source Analytics Conference is back! We continue to connect developers of advanced open source analytic projects with app developers. Our theme for 2022 was using open source analytics to find business opportunities quickly and cost-efficiently. Missed out on OSA Con 2022? That’s okay. You can see it all on-demand below.
OSA CON 2022 Topic Categories:
Keynotes and Open Source
Join us as we guide you through the conference and highlight the many presenters who are contributing talks. We’ll also include a few tips about how to use the conference platform.
Presentation Slides
Robert Hodges, CEO
Every generation builds new cathedrals. For many of us, this means implementing analytic applications built on a foundation of open source.
We’ll survey developments in analytics since the last OSA Con and highlight new technologies that developers should be watching as we head into the mid-2020s.
Presentation Slides
Robert Hodges, CEO
Artyom Keydunov, Cube Dev CEO and Co-founder, gave an in-depth tutorial at OSACon 2022 on using Cube’s headless BI layer to provide consistent data for all your applications (e.g., for visualization, dashboarding, embedded analytics) while also ensuring sub-second latency, centrally managing access rules, etc.
Artyom Keydunov, CEO & Co-Founder
Peter Zaitsev, Founder of Percona, gave an insightful keynote speech at #OSACon 2022 on the state of open-source databases. Watch the talk to learn about the key developments over 2022, including the most important open-source database software releases in general, the significance of cloud-native solutions in a multi-vendor, multi-cloud world, the new criticality of security challenges, and the evolution of the open-source software industry.
Peter Zaitsev, Founder of Percona
ClickHouse
At OSACon 2022, Prafulla Gupta, Principal Architect at Times Internet, demonstrated how Times Internet empowered its Product Managers and Editors by developing an in-house product, GrowthRx, using ClickHouse Open Source Database to track and analyze user behavior to increase user retention and customer engagement.
Prafulla Gupta, Principal Architect
Our team switched our Jaeger (open source project used for distributed tracing) storage backend to ClickHouse (from Cassandra), which opened the door to a world of advanced analytics that we can run and provide our users. This talk will describe the journey from the switch, the learning curve, the challenges, and the eventual wins.
Presentation Slides
Satbir Chahal, Principal (Founding) Engineer
Analytics
Peter Zaitsev (Founder of Percona), Ed Huang (CEO & Co-founder of PingCAP), Alexander Zaitsev (CTO of Altinity), and Robert Hodges (CEO of Altinity) have a #OSACon panel discussion on what HTAP is and how it works, when it’s better to add an analytic database, the questions developers should ask while designing new applications, and more.
Peter Zaitsev, Founder of Percona,
Ed Huang, CEO & Co-founder of PingCAP,
Alexander Zaitsev, CTO of Altinity, and Robert Hodges, CEO of Altinity
At #OSACon 2022, ChistaDATA’s ClickHouse Database Admin, Sri Sakthivel M.D., outlined how real-time data platforms deliver events immediately to applications and allow businesses to react quickly. His talk ranged from how to build a fast, hardware-efficient platform that is easy to manage with SQL, through an intro to Redpanda and ksqlDB, to the details of a specific use case showing data traveling from #MySQL to ksqlDB. Watch him in action!
Sri Sakthivel M.D., ClickHouse Database Admin, ChistaDATA
Time series data is special, not only in its nature but also in the ways that we store and interact with it. In this talk, we’ll cover the differences between storing time series data in classic relational databases and a new generation of time series databases like VictoriaMetrics and Prometheus.
Presentation Slides
Roman Khavronenko, Co-Founder
The data lakehouse is one of the most exciting trends in the data space, promising to merge the best aspects of data lakes and data warehouses without the problems of either. Open source tech is making this promise a reality, and in this talk Dremio Developer Advocate Alex Merced explores these technologies.
Presentation Slides
Alex Merced, Developer Advocate
Ahana’s Co-founder and CEO, Steven Mih, took the stage at #OSACon 2022 to talk about the Linux Foundation’s latest Presto innovations (e.g., multi-level caching) and how Presto is used for the open SQL lakehouse. Watch his talk to get a quick look at the history, use cases, and technical innovations.
Steven Mih, Co-Founder and CEO
Applications
Mary Grygleski and Mark Needham showed how analytical queries can be run on top of Apache Pulsar’s event data with Apache Pinot at #OSACon 2022. They explored the integration between Pulsar and Pinot and demonstrated in detail how to build a real-time analytics dashboard with these technologies.
Mary Grygleski, Streaming Developer Advocate at DataStax
Mark Needham, Developer Relations Engineer at StarTree
David Li, Software Engineer at Voltron Data, gave an in-depth talk at #OSACon on #ApacheArrow, highlighting use cases where Arrow has accelerated analytics workflows by as much as 100x and showing where Arrow is going, with special attention to database connectivity.
David Li, Software Engineer, Voltron Data
Ponder’s Co-founder and CEO, Doris Lee, explained at #OSACon 2022 the prevailing productivity challenges that data teams face when going from prototype to production and how Modin addresses them. She presented case studies highlighting how Modin empowers data teams to seamlessly accelerate their Pandas workflows. See her in action!
Doris Lee, Co-founder and CEO, Ponder
In this talk we’ll go through how we have designed and built over 20 different SDKs to collect events from all sorts of applications (from web & mobile to IoT to server-side), allowing users to collect a rich event stream of data. Then we’ll dive into, and demonstrate, the cross-warehouse downstream data models which aggregate the event stream into easy-to-consume data products for analytics, AI, composable CDP, recommendation engines, and many other use cases.
Presentation Slides
Paul Boocock, Head of Engineering, Snowplow
Knowledge Base
Frontend engineering went through a revolution in the last decade. I’ll recap what happened, and how a similar revolution started in data engineering.
Presentation Slides
Pete Hunt, Head of Engineering, Elementl
OSACon 2022: Rachel Pedreschi, VP of Technical Services at Ahana, explained how you can decide whether a data warehouse or a data lakehouse is best for your use case and workloads. She discussed the data landscape, how to approach your data warehouse/data lakehouse strategy, and how some real-world customers are using Presto to bring analytics to the data lake.
Rachel Pedreschi, VP of Technical Services, Ahana
Michael shows how signal correlation in observability use cases helps you spot issues faster, optimize code, and become more productive in delivering features.
Presentation Slides
Michael Hausenblas, Solution Engineering Lead, AWS
Tim Spann & David Kjerrumgaard, Developer Advocates at StreamNative, showed how to click into new streaming applications the easy way with Apache Pulsar, ClickHouse, and Open Source. The #OSACon 2022 talk is a quick introduction on how to build modern data streaming applications.
Tim Spann and David Kjerrumgaard, Developer Advocates
Airbyte’s Senior Data/Analytics Engineer and Director of Connector Engineering, Alexandra Gronemeyer and Brian Leonard, recently presented a fascinating talk at #OSACon 2022. They extracted, loaded, transformed, and analyzed data to better learn about developer communities. Along the way, they also showed how to put open-source tools like Airbyte, ClickHouse, dbt, and Metabase into action to answer business questions.
Alexandra Gronemeyer, Senior Data/Analytics Engineer
Brian Leonard, Director of Connector Engineering
Sign Up for Updates
We’ll keep you informed of OSA-CON-related news.
Thank You to Our Community Partners
We could not have pulled this together without your help promoting and locating speakers.
This is a community of communities and we are so happy to be a part of it all. Here’s to you all!
OSA-CON 2022 Sponsors
cube.dev
Open source headless BI: Consume data from any data source, organize it into consistent metrics, and use it with every app.
cube.dev
RudderStack
RudderStack is the warehouse-first, customer data platform (CDP) built for developers. The company takes a new approach to building and operating customer data infrastructure, making it easy to collect, unify, transform, and store customer data as well as securely route it to a wide range of common, popular tools.
rudderstack.com
Percona
Percona is a leading provider of unbiased open source database solutions that allow orgs to easily, securely and affordably maintain business agility, minimize risks, and stay competitive.
percona.com
PostHog
PostHog is an open-source product analytics platform that offers a suite of tools, including funnels, heat maps, session recording and more, all in a single platform.
posthog.com
VictoriaMetrics
VictoriaMetrics is a fast and scalable open source time series database and monitoring solution, with the company behind it focusing on helping individuals and organizations address their big data challenges through state-of-the-art monitoring and observability solutions.
victoriametrics.com
Aiven
Aiven provides managed open source data technologies, like PostgreSQL, Apache Kafka and OpenSearch, on all major clouds. Aiven enables customers to drive business results from open source that trigger true transformations far beyond their own backyard.
aiven.io
Meltano
Meltano enables everyone to realize the full potential of their data. We are building a modular, open source DataOps OS to be the foundation of every team’s ideal data stack. Meltano simplifies configuration, deployment and monitoring, and unites best-in-class open source tools and technologies for the data lifecycle. It allows data teams to benefit from DevOps best practices such as version control, code review and CI/CD.
meltano.com
Airbyte
Airbyte is the modern open-source ELT standard that replicates data from the long tail of APIs, databases & files to data warehouses, lakes and other destinations. Airbyte Cloud disrupts the ELT market with its transparent volume-based pricing and open-source extensibility.
airbyte.com
OpsVerse
OpsVerse is the creator of OpsVerse ONE, an IDP that unifies DevOps tools, a microservices catalog, and documentation. The company also offers fully-managed, open source, best-of-breed DevOps tools that can run anywhere in minutes. With OpsVerse’s private SaaS framework, anyone can achieve enhanced data residency, governance, and audit controls without spending additional engineering resources.
opsverse.io
Tinybird
Tinybird is a serverless analytical backend for developers. Build low-latency APIs in minutes with nothing but SQL.
tinybird.co
Ponder
Ponder builds enterprise-ready tools for rapid, flexible experimentation with data at scale. Operate on data at any scale, while continuing to use the familiar Pandas API. Powered by open-source Modin and Lux.
ponder.io
Times Internet
Times Internet is an Indian internet technology company, based in Gurgaon, which owns, operates and invests in various internet-led products, services and technology. It is the digital arm of the Times Group, the largest media conglomerate in India.
timesinternet.in
Ahana
Ahana, the Presto company, offers the only managed service for Presto on AWS with the vision to simplify open data lake analytics. Presto, the open source project created by Facebook and used at Uber, Twitter and thousands more, is the de facto standard for fast SQL processing on data lakes.
ahana.io
PingCAP
PingCAP is an enterprise-grade software service provider committed to delivering an open-source, cloud-native, one-stop database solution for growth-oriented clients to focus on their business priorities.
pingcap.com
Dremio
Dremio is the lakehouse company. Hundreds of organizations, including 3 of the Fortune 5, use Dremio to deliver mission-critical BI on the data lake. As the original creator of Apache Arrow, Dremio is on a mission to reinvent SQL for data lakes and meet customers where they are in their cloud journey. Dremio was founded in 2015 and is headquartered in Santa Clara, CA.
dremio.com