Mackenzie is the Global Startup Evangelist at AWS. His days are spent traveling the globe to meet startups, share their stories, and connect engineering teams. Every day a large number of startups launch on AWS across every imaginable industry. It's Mackenzie's mission to find stories of startups that are helping to improve the world and share these stories with a wide audience.
Prior to joining AWS, Mackenzie was the Head of Technical Operations at Betterment, the world's largest independent robo-advisor, which is based in NYC and manages over $8B in assets. Mackenzie was a founding engineer and Head of Technical Operations at Oscar Health, an insurance startup also based in NYC, helping to grow the company to more than 400 employees.
This event will focus on Big Data and Observability solutions available in the Cloud Native Computing Foundation (CNCF) ecosystem and most relevant to our strategic customers' use cases. For Big Data, we will discuss Big Data schedulers, Spark on Kubernetes, and Iceberg. For Observability, we invite customers to talk about Kubernetes Observability and Monitoring with events, logs, traces, and metrics, how Observability is utilized to enable data analytics workloads, and how Big Data can help with faster mean time to detect (MTTD) or mean time to recover (MTTR) in case of a failure.
Intuit has been increasingly focused on delivering the best customer experience and on gaining better visibility into, and faster response to, problems encountered by customers. Developers at Intuit leverage our Real User Monitoring Solution to measure and alert on the availability of key user functionalities such as payments or tax filing. Intuit's AIOps platform powers our alerting capability, which detects anomalies in customer experiences in real time.
In this talk, Venkatesh Rangarajan and Vigith Maurice will share how we apply Big Data and AIOps techniques to Real User Monitoring and real-time tracing data to reduce the time it takes to detect and triage customer-impacting issues to near zero.
Speakers: Venkatesh Rangarajan, Observability Product Manager & Vigith Maurice, Observability Technical Lead
Developers at Adobe leverage Ethos, a multi-tenant, multi-cloud compute platform based on Kubernetes. This platform is used across Adobe by various product teams to host their solutions. Over time, the complexity of services has increased, which in turn has required investment in solutions beyond the core observability stack. To address these challenges, Ethos has built capabilities centered on diagnosing issues and identifying root causes. The focus is to provide self-service capabilities that let development teams detect root causes and remediate issues. This has helped reduce MTTR for various incidents and improved overall reliability. These tools have also been integrated with the "Get Help" workflow and are a key capability of Adobe's Internal Developer Platform (IDP). In this session, attendees will learn about Adobe's observability stack and the self-service diagnostics capabilities offered by Ethos.
Speakers: Shibashis Mishra, Senior Engineering Manager & Rohan Kapoor, Product Manager
Integrating an Observability solution for Amazon Aurora Database Into Workday's Centralized Observability Platform
Cloud relational database services such as Amazon Aurora offer robust, native observability features to help customers. Large enterprises like Workday, however, build their own centralized Observability platforms. Continuing to depend on the native observability suite for cloud databases like Aurora can result in a fragmented picture, which makes it challenging to integrate logging and monitoring of a complex cloud database system into the existing Observability platform and achieve a "single pane of glass". In this session, Sandesh Achar, Director Cloud Engineering, and Nathan Tisuela, Software Engineer, from Workday discuss how they integrated an observability solution for Amazon Aurora with their centralized Observability platform.
Speakers: Sandesh Achar, Director Cloud Engineering & Nathan Tisuela, Software Engineer, Workday
Cloud native solutions have many software layers between the system hardware and the layer at which consumers receive a service, whether IaaS, PaaS, or SaaS, and each service runs in parallel with many others. These abstractions make it difficult to explain performance effects observed at the service delivery level in terms of the underlying performance phenomena in hardware and at each intermediate layer of abstraction or virtualization. This talk will sketch how one may use hardware performance monitoring units (PMUs) in conjunction with other types of monitoring in the software layers to gain insight into sources of performance loss or opportunities for performance gain. Several factors complicate this exercise in connecting the dots: many operations in hardware and software run in parallel; even logically sequential instructions execute out of program order to emulate data flow machines; and multi-tenant operation (for efficiency and economy of scale) makes it hard to account for the effects of different tenants on each other's performance. In this presentation we describe some lessons in putting together the big-picture view by collecting and analyzing key pieces of explanatory detail from the hardware PMUs and correlating them with various other measures of performance in the different software layers.
Speaker: Harshad Sane, Principal Engineer, Intel
Accurate prediction of cloud workload resource consumption is a crucial tool for optimizing cloud capacity usage and maximizing the value of cloud assets. Estimating resource requirements and setting the service configuration based on experience is a useful approach. However, resource requirements backed by data allow operators and administrators to make informed decisions when setting service configurations. In this talk, Shrey will describe resource tier recommendations for data science workloads in the cloud, using the JupyterHub application as an example. He will demonstrate how to fetch CPU and memory telemetry data from user pods on the Operate First cluster and train a learning algorithm to recommend tiers. He will then discuss the implications of such an approach and how it can be extended to use cases like detecting and forecasting node failures and reducing the service energy footprint. Attendees will learn how to use telemetry data from their clusters to optimize their resource usage and drive decisions with AIOps for cloud environments.
Speaker: Shrey Anand, Data Scientist, Emerging Technologies, Red Hat
Trino is a distributed SQL engine that is widely used for big data analytics. However, it comes with a set of challenges when run at large scale, particularly for batch ETL. In this presentation, we discuss our experiences developing Huron, Salesforce's internal observability platform, which leverages Trino to run analytics across all our telemetry. Huron is used by service owners, SREs, and engineers to obtain insights from their observability data with the goal of enhancing service availability. We describe some of the key considerations involved in running large-scale ETLs with Trino in the cloud, including challenges with scaling writes against object stores, factors impacting cost to serve, and how we made the decision to switch to Trino-Iceberg.
Speakers: Conor McAvoy, Software Engineer, Salesforce & Vincent Poon, Software Architect, Salesforce Monitoring Cloud
Pinterest is a company that provides inspiration to build a life you love. Through advanced recommendation engines, ML pipelines, an in-house analytics engine, and an exabyte-scale data lake, we are able to delight our users with highly targeted inspiration for their projects. Over the years, Pinterest DataEngineering has developed several in-house analytics platforms based on the Hadoop ecosystem and YARN. Recently we ran an experiment to explore moving Spark workloads to k8s and to evaluate performance at scale and the overall viability of using k8s for Spark.
In this presentation, we will talk about our business drivers for considering k8s for batch workloads, results from running large-scale tests on EKS during our evaluation, lessons learned, and what we'd do differently.
Speakers: Rainie Li, Software Engineer, DataEng & William Tom, Software Engineer, DataEng