<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Ricardo Liberato: Observability]]></title><description><![CDATA[Grafana, Prometheus, Loki, and InfluxData tips and how-tos]]></description><link>https://www.liberato.pt/s/observability</link><image><url>https://substackcdn.com/image/fetch/$s_!MDPa!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2Fceec0588-8a7f-4fb4-bc6d-1162b686f4ce_2501x2501.png</url><title>Ricardo Liberato: Observability</title><link>https://www.liberato.pt/s/observability</link></image><generator>Substack</generator><lastBuildDate>Sun, 05 Apr 2026 20:10:07 GMT</lastBuildDate><atom:link href="https://www.liberato.pt/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Ricardo Liberato]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[riclib@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[riclib@substack.com]]></itunes:email><itunes:name><![CDATA[Ricardo Liberato]]></itunes:name></itunes:owner><itunes:author><![CDATA[Ricardo Liberato]]></itunes:author><googleplay:owner><![CDATA[riclib@substack.com]]></googleplay:owner><googleplay:email><![CDATA[riclib@substack.com]]></googleplay:email><googleplay:author><![CDATA[Ricardo Liberato]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[What to Read on DevOps Observability]]></title><description><![CDATA[My book recommendations to get you started on your observability journey]]></description><link>https://www.liberato.pt/p/what-to-read-on-devops-observability</link><guid 
isPermaLink="false">https://www.liberato.pt/p/what-to-read-on-devops-observability</guid><dc:creator><![CDATA[Ricardo Liberato]]></dc:creator><pubDate>Sun, 23 Apr 2023 12:47:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!fmcd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>DevOps observability is a critical component of modern software development and operations. It is the practice of using monitoring, logging, metrics, tracing, and alerting to gain insights into the behavior of complex distributed systems. DevOps observability is essential for identifying and resolving issues quickly, minimizing downtime, and ensuring the reliability and performance of applications.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fmcd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fmcd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!fmcd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!fmcd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fmcd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fmcd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1675973,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fmcd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!fmcd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!fmcd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fmcd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafb20c53-8107-43c5-9f70-fbf01d76d497_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>If you are interested in learning more about DevOps observability, here are the top five books that I recommend:</p><ol><li><p>"<strong>Site Reliability Engineering: How Google Runs Production Systems</strong>" by Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy: This book provides an in-depth look at Google's approach to managing 
large-scale systems and how they use observability to monitor and troubleshoot their production environments. It covers topics such as monitoring, logging, tracing, and alerting, and provides practical advice for building highly reliable and scalable systems.</p></li><li><p>"<strong>Monitoring and Observability</strong>" by Cindy Sridharan: This book offers a comprehensive overview of observability, its key concepts, and how it can be applied to monitoring and troubleshooting distributed systems. It covers topics such as logging, metrics, tracing, and alerting, and provides practical advice for building effective observability systems.</p></li><li><p>"<strong>Observability Engineering</strong>" by Charity Majors, Liz Fong-Jones, and George Miranda: This book provides a practical guide to building and managing highly observable systems, with a focus on logging, metrics, tracing, and alerting. It covers topics such as service-level objectives (SLOs), error budgets, and chaos engineering, and provides practical advice for building and scaling observability systems.</p></li><li><p>"<strong>Effective DevOps: Building a Culture of Collaboration, Affinity, and Tooling at Scale</strong>" by Jennifer Davis and Katherine Daniels: While not exclusively focused on observability, this book covers all aspects of DevOps, including observability, monitoring, and logging. It gives a comprehensive overview of the DevOps culture, mindset, and practices, and offers practical advice for building and scaling effective DevOps teams.</p></li><li><p>"<strong>Distributed Systems Observability</strong>" by Cindy Sridharan: This book focuses on observability in distributed systems and how to apply it to detect and diagnose problems in complex, multi-layered architectures. 
It covers topics such as distributed tracing, log aggregation, and anomaly detection, and provides practical advice for building effective observability systems.</p></li></ol><p>I recommend these books to anyone who wants to learn more about DevOps observability and build more reliable, scalable, and resilient systems. These books offer practical advice, real-world examples, and best practices that can help you improve your observability and DevOps practices.</p><p>What books have you read that should be on this list?</p>]]></content:encoded></item><item><title><![CDATA[Breaking the High Cardinality Barrier]]></title><description><![CDATA[Leveraging the Synergy between Grafana Loki and Prometheus to Monitor High Cardinality Jobs]]></description><link>https://www.liberato.pt/p/breaking-the-high-cardinality-barrier</link><guid isPermaLink="false">https://www.liberato.pt/p/breaking-the-high-cardinality-barrier</guid><dc:creator><![CDATA[Ricardo Liberato]]></dc:creator><pubDate>Mon, 17 Apr 2023 04:37:51 GMT</pubDate><enclosure 
url="https://substackcdn.com/image/fetch/$s_!w2Rc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently wrote a <a href="https://grafana.com/blog/2022/12/02/monitoring-high-cardinality-jobs-with-grafana-grafana-loki-and-prometheus/">blog post</a> for Grafana about my experience using Grafana, Prometheus, Grafana Loki, and our custom-built exporters to monitor high cardinality jobs. It is based on my experience monitoring a 3,000-node data lake, and especially its data load process.</p><p>In the post, I explain how we were able to leverage the deep synergies between Loki and Prometheus to monitor the actual performance of jobs, allowing us to reduce cycle time for loads from 20 minutes to less than six minutes. By combining metrics with log information, we were able to understand in depth where compute and memory were being used efficiently and where they were being wasted. This unlocked 40% savings on the cost of the cloud infrastructure supporting these stream jobs.</p><p>I go deeper into these two use cases and also highlight the job_exporter we built that leverages the symbiosis between Prometheus and Loki to break the high cardinality barrier. Our job monitoring journey started by implementing one-off solutions for Databricks and for Azure Data Factory. 
The learnings from these two implementations &#8212; and the need to extend to more platforms &#8212; led us to build a generic job_exporter that is easily extensible.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w2Rc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w2Rc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 424w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 848w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 1272w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w2Rc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png" width="1456" height="767" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:767,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:&quot;&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!w2Rc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 424w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 848w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 1272w, https://substackcdn.com/image/fetch/$s_!w2Rc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc81c0dee-4607-41da-b668-a22f6fcf2d5a_1600x843.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I hope my experience inspires you to look for what you can achieve by breaking the high cardinality barrier. Check out the full blog post on the Grafana blog to learn more:<br><a href="https://grafana.com/blog/2022/12/02/monitoring-high-cardinality-jobs-with-grafana-grafana-loki-and-prometheus/">https://grafana.com/blog/2022/12/02/monitoring-high-cardinality-jobs-with-grafana-grafana-loki-and-prometheus/</a></p>]]></content:encoded></item></channel></rss>