Scaling GPUs ⚖️, Spark Declarative Pipelines ✨, AirFrance’s Automation Platform ✈️

TLDR DevOps <dan@tldrnewsletter.com>

January 12, 12:11 pm

TLDR DevOps 2026-01-12

📱

News & Trends

Elastic achieves ISO 27701 certification (3 minute read)

Elastic has achieved ISO/IEC 27701 certification, confirming a robust Privacy Information Management System covering all cloud, serverless, and self-managed deployments. The certification strengthens global trust, simplifies customer compliance, and reinforces Elastic's commitment to continuous privacy and security improvement.
Introducing Multiple Registry Support on DigitalOcean Container Registry (4 minute read)

DigitalOcean has made multi-registry support for its Container Registry (DOCR) generally available, allowing Professional Plan customers to create and manage up to 10 separate container registries under a single team at no additional cost. This enhancement provides greater flexibility for image rollout, environment isolation, regional performance, and regulatory compliance.
🚀

Opinions & Tutorials

How Air France-KLM built a secure automation platform at global scale with Terraform, Vault, and Ansible (6 minute read)

Air France-KLM reworked its automation platform using Terraform Enterprise, Vault, and Ansible to scale multi-cloud infrastructure securely while reducing cost and complexity. The shift from compliance-by-construction to policy-based guardrails cut provisioning to minutes, minimized errors, and enabled governance at scale.
When you have a hammer, everything looks like a nail (6 minute read)

Premature adoption of cloud, Kubernetes, serverless, or GenAI without understanding business needs leads to higher cost, complexity, and risk. Use outcome-driven architecture, implement explicit constraints, selectively choose technology, and prioritize simplicity over trends.
Keeping 20,000 GPUs healthy (8 minute read)

Modal runs a globally distributed pool of over 20,000 GPUs across major cloud providers using continuous benchmarking, standardized machine images, and layered GPU health checks. GPU failures are common at this scale. High uptime depends on automated detection, isolation, and rapid replacement rather than provider guarantees alone.
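The detect-and-isolate idea can be sketched in a few lines of Python. This is only an illustration, not Modal's implementation: the GpuNode fields, the three checks, and every threshold below are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class GpuNode:
    # Hypothetical health signals for one GPU node.
    node_id: str
    temperature_c: float
    ecc_errors: int
    benchmark_tflops: float


Check = Callable[[GpuNode], bool]

# Layered checks, each returning True when the node passes.
# All thresholds are illustrative assumptions.
CHECKS: list[tuple[str, Check]] = [
    ("thermal", lambda n: n.temperature_c < 85.0),
    ("ecc", lambda n: n.ecc_errors == 0),
    ("benchmark", lambda n: n.benchmark_tflops >= 100.0),
]


def failed_checks(node: GpuNode) -> list[str]:
    """Return the names of the checks this node fails, in layer order."""
    return [name for name, check in CHECKS if not check(node)]


def triage(nodes: list[GpuNode]) -> tuple[list[str], dict[str, list[str]]]:
    """Split nodes into healthy IDs and quarantined IDs with failure reasons."""
    healthy: list[str] = []
    quarantined: dict[str, list[str]] = {}
    for node in nodes:
        failures = failed_checks(node)
        if failures:
            quarantined[node.node_id] = failures  # isolate, then replace
        else:
            healthy.append(node.node_id)
    return healthy, quarantined
```

A scheduler built on this would place work only on the healthy list and request replacements for quarantined nodes.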
🧑‍💻

Resources & Tools

Gartner Report: When NOT to Use AI Agents (Sponsor)

AI agents are showing up everywhere. In the wrong use cases, they can become black boxes that break quietly and get blamed loudly. This Gartner report helps IT and DevOps leaders decide where agents are worth it and where traditional automation is the better move. Read the report →
ConvertX (GitHub Repo)

ConvertX is a self-hosted online file converter that supports over 1,000 different formats. Built with TypeScript, Bun, and Elysia, it is available as a Docker image on GitHub Container Registry and Docker Hub.
tunnelto (GitHub Repo)

tunnelto, written in Rust with async-io on tokio, exposes locally running web servers via a public URL. Its hosted version runs as a distributed system on fly.io, using Private Networking and a gossip mechanism.
🎁

Miscellaneous

Azure Policy: Required Actions for Docker Content Trust Deprecation in Azure Container Registry (2 minute read)

Azure Container Registry is deprecating Docker Content Trust over three years, removing the trustPolicy property from ARM APIs and affecting related Azure Policy aliases. Custom policies referencing these aliases must be updated or removed to avoid compliance issues, as new ACR resources will become non-compliant once the property is deprecated.
From Chaos to Scale: Templatizing Spark Declarative Pipelines with DLT-META (5 minute read)

DLT-META is a metadata-driven metaprogramming framework for Spark Declarative Pipelines designed to automate pipeline creation and standardize logic. The solution enables teams to scale data pipelines efficiently by centralizing configuration and dynamically generating logic at runtime, significantly reducing custom code and maintenance.
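To make the metadata-driven idea concrete, here is a small plain-Python sketch that generates pipeline steps from configuration at runtime. It does not use the DLT-META or Spark APIs; the "op" names, config keys, and helper functions are all hypothetical.

```python
from typing import Callable

Row = dict[str, object]
Step = Callable[[list[Row]], list[Row]]


def build_step(spec: dict) -> Step:
    """Turn one metadata entry into a transformation function."""
    kind = spec["op"]
    if kind == "filter_not_null":
        column = spec["column"]
        return lambda rows: [r for r in rows if r.get(column) is not None]
    if kind == "rename":
        old, new = spec["from"], spec["to"]
        return lambda rows: [
            {(new if key == old else key): value for key, value in r.items()}
            for r in rows
        ]
    raise ValueError(f"unknown op: {kind}")


def build_pipeline(config: list[dict]) -> Step:
    """Compose the generated steps in the order the config lists them."""
    steps = [build_step(spec) for spec in config]

    def run(rows: list[Row]) -> list[Row]:
        for step in steps:
            rows = step(rows)
        return rows

    return run
```

Adding a new table then means adding a config entry rather than writing another hand-coded pipeline.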
The coolest feature in Python 3.14 (3 minute read)

Python 3.14 introduces sys.remote_exec(), which allows code to be executed inside an already running Python process without restarting it. This capability enables tools like debugwand to attach a remote debugger to Python applications running in Docker or Kubernetes with no code changes or sidecars, relying only on runtime injection and configuration.
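A minimal sketch of the call itself: sys.remote_exec(pid, script) takes the target process ID and the path to a Python file, which the target runs at its next safe point. The helper names write_probe_script and inject are hypothetical, the call may need elevated privileges depending on the platform, and the code falls back gracefully on interpreters older than 3.14.

```python
import os
import sys
import tempfile


def write_probe_script(message: str) -> str:
    """Write a one-line script to a temp file and return its path."""
    fd, path = tempfile.mkstemp(suffix=".py")
    with os.fdopen(fd, "w") as handle:
        handle.write(f"print({message!r})\n")
    return path


def inject(pid: int, script_path: str) -> bool:
    """Ask the running Python process `pid` to execute the script file.

    Returns False when this interpreter has no sys.remote_exec
    (that is, it is older than Python 3.14).
    """
    remote_exec = getattr(sys, "remote_exec", None)
    if remote_exec is None:
        return False
    remote_exec(pid, script_path)
    return True
```

Usage would look like inject(target_pid, write_probe_script("attached")), where target_pid is the ID of a running Python 3.14 process you are allowed to inspect.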

Quick Links

Amazon ECS now supports tmpfs mounts on AWS Fargate and ECS Managed Instances (2 minute read)

Amazon ECS now supports tmpfs mounts for Linux tasks on AWS Fargate and ECS Managed Instances, enabling memory-backed file systems for temporary, high-performance, and sensitive data.
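As a rough illustration, tmpfs mounts are declared under linuxParameters in a container definition. The sketch below only builds that fragment as a Python dict; the helper name, container name, image, path, and size are placeholder assumptions, and a real task definition needs more fields than shown.

```python
import json


def container_with_tmpfs(name, image, path, size_mib, options=None):
    """Build a container definition fragment with one tmpfs mount."""
    return {
        "name": name,
        "image": image,
        "linuxParameters": {
            "tmpfs": [
                {
                    "containerPath": path,           # mount point inside the container
                    "size": size_mib,                # size in MiB
                    "mountOptions": options or [],   # for example ["rw", "noexec"]
                }
            ]
        },
    }


fragment = container_with_tmpfs("app", "example/app:latest", "/scratch", 64, ["rw", "noexec"])
print(json.dumps(fragment, indent=2))
```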
Don't fall into the anti-AI hype (5 minute read)

Recent advances in large language models have fundamentally changed programming by enabling most code to be generated through prompting, shifting the primary skill from writing code to defining problems and reviewing solutions.

Thanks for reading,
Kunal Desai & Martin Hauskrecht

