Posts Tagged ‘DevoxxFR2013’

[DevoxxFR2013] Strange Loops: A Mind-Bending Journey Through Java’s Hidden Curiosities

Lecturers

Guillaume Tardif has been crafting software since 1998, primarily in the Java and JEE ecosystem. His roles span technical leadership, agile coaching, and architecture across consulting firms and startups. Now an independent consultant, he has presented at Agile Conference 2009, XP Days 2009, and Devoxx France 2012, blending technical depth with philosophical inquiry.

Eric Lefevre-Ardant began programming in Java in 1996. His career alternates between Java consultancies and startups, currently as an independent consultant. Together, they explore the boundaries of code, inspired by Douglas Hofstadter’s Gödel, Escher, Bach.

Abstract

Guillaume Tardif and Eric Lefevre-Ardant invite you on a disorienting, delightful promenade through the strangest corners of the Java language — a journey inspired by Douglas Hofstadter’s exploration of self-reference, recursion, and emergent complexity. Through live-coded puzzles, optical illusions in syntax, and meta-programming mind-benders, they reveal how innocent-looking code can loop infinitely, reflect upon itself, or even generate its own source. The talk escalates from simple for loop quirks to genetic programming, culminating in a real-world example of self-replicating machines: the RepRap 3D printer. This is not a tutorial — it is a meditation on the nature of code, computation, and creation.

The Hofstadter Inspiration

Douglas Hofstadter’s Gödel, Escher, Bach explores strange loops — hierarchical systems that refer to themselves, creating emergent meaning. The presenters apply this lens to Java: a language designed for clarity, yet capable of profound self-referential trickery. They begin with a simple puzzle:

for (int i = 0; i < 10; i++) {
  System.out.println(i);
  i--;
}

What does it print? The answer — an infinite loop — reveals how loop variables can be manipulated in ways that defy intuition. This sets the tone: code is not just logic; it is perception.

Syntactic Illusions and Parser Tricks

The duo demonstrates Java constructs that appear valid but behave unexpectedly due to parser ambiguities. Consider:

label: for (int i = 0; i < 5; i++) {
  if (i == 3) break label;
  System.out.println(i);
}

The label: seems redundant on a single loop; its value emerges with nested loops, where continue label abandons the rest of an outer iteration, as in the sketch below.
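A small nested-loop sketch (illustrative, not verbatim from the talk):

outer:
for (int i = 0; i < 3; i++) {
  for (int j = 0; j < 3; j++) {
    if (j > i) continue outer; // jump straight to the next iteration of the outer loop
    System.out.println(i + "," + j);
  }
}

They also show how the most vexing parse confuses even experienced developers: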

new Foo(new Bar());
// vs
new Foo(new Bar()); // same?

Subtle whitespace and operator precedence create optical illusions in code readability.

Reflection and Meta-Programming

Java’s reflection API enables programs to inspect and modify themselves at runtime. The presenters write a method that prints its own source code — a quine-like construct:

import java.nio.file.*;

public static void printSource() throws Exception {
  // getCodeSource() points at the classes directory, so resolve the adjacent .java file.
  Path dir = Paths.get(Quine.class.getProtectionDomain().getCodeSource().getLocation().toURI());
  Files.lines(dir.resolve("Quine.java")).forEach(System.out::println);
}
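For contrast, a true quine reproduces its source with no file I/O at all; a classic Java formulation (not from the talk) fits on one line:

public class Q{public static void main(String[] a){String s="public class Q{public static void main(String[] a){String s=%c%s%1$c;System.out.printf(s,34,s);}}";System.out.printf(s,34,s);}}

Here %c consumes the argument 34 (a double quote), %s re-inserts the format string, and %1$c closes the quote, so the program prints exactly its own text.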

They escalate to bytecode manipulation with Javassist, generating classes dynamically. This leads to a discussion of genetic programming: modeling source code as a tree, applying mutations and crossovers, and evolving solutions. While more natural in Lisp, Java implementations exist using AST parsing and code generation.
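A minimal Javassist sketch of the dynamic class generation they describe (illustrative; the class name and method body here are invented):

import javassist.ClassPool;
import javassist.CtClass;
import javassist.CtNewMethod;

public class GenerateClass {
    public static void main(String[] args) throws Exception {
        ClassPool pool = ClassPool.getDefault();
        CtClass cc = pool.makeClass("EvolvedCandidate");            // exists only in memory so far
        cc.addMethod(CtNewMethod.make(
                "public int fitness() { return 42; }", cc));        // method body compiled from a string
        Class<?> clazz = cc.toClass();                              // load into the running JVM
        Object candidate = clazz.getDeclaredConstructor().newInstance();
        System.out.println(clazz.getMethod("fitness").invoke(candidate)); // prints 42
    }
}

In a genetic-programming loop, the string body would itself be generated by mutating and recombining ASTs before recompilation.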

The Ultimate Strange Loop: Self-Replicating Machines

The talk culminates with the RepRap project — an open-source 3D printer designed to print its own parts. Begun in 2005, RepRap achieved partial self-replication by 2008, printing about 50% of its components. The presenters display a physical model, explaining how the printer’s design files, firmware, and mechanical parts form a closed loop of creation.

They draw parallels to John von Neumann’s self-replicating machines and Conway’s Game of Life — systems where simple rules generate infinite complexity. In Java terms, this is the ultimate quine: a program that outputs a machine that runs the program.

Philosophical Implications

What does it mean for code to reflect, replicate, or evolve? The presenters argue that programming is not just engineering — it is art, philosophy, and exploration. Strange loops remind us that:

  • Clarity can mask complexity
  • Simplicity can generate infinity
  • Code can transcend its creator

They close with a call to embrace curiosity: write a quine, mutate an AST, print a 3D part. The joy of programming lies not in solving known problems, but in discovering new ones.

Links

Hashtags: #StrangeLoops #JavaPuzzlers #SelfReference #GeneticProgramming #RepRap #GuillaumeTardif #EricLefevreArdant

[DevoxxFR2013] Lily: Big Data for Dummies – A Comprehensive Journey into Democratizing Apache Hadoop and HBase for Enterprise Java Developers

Lecturers

Steven Noels stands as one of the most visionary figures in the evolution of open-source Java ecosystems, having co-founded Outerthought in the early 2000s with a mission to push the boundaries of content management, RESTful architecture, and scalable data systems. His flagship creation, Daisy CMS, became a cornerstone for large-scale, multilingual content platforms used by governments and global enterprises, demonstrating that Java could power mission-critical, document-centric applications at internet scale. But Noels’ ambition extended far beyond traditional CMS. Recognizing the seismic shift toward big data in the late 2000s, he pivoted Outerthought—and later NGDATA—toward building tools that would make the Apache Hadoop ecosystem accessible to the average enterprise Java developer. Lily, launched in 2010, was the culmination of this vision: a platform that wrapped the raw power of HBase and Solr into a cohesive, Java-friendly abstraction layer, eliminating the need for MapReduce expertise or deep systems programming.

Bruno Guedes, an enterprise Java architect at SFEIR with over a decade of experience in distributed systems and search infrastructure, brought the practitioner’s perspective to the stage. Having worked with Lily from its earliest alpha versions, Guedes had deployed it in production environments handling millions of records, integrating it with legacy Java EE applications, Spring-based services, and real-time analytics pipelines. His hands-on experience—debugging schema migrations, tuning SolrCloud clusters, and optimizing HBase compactions—gave him unique insight into both the promise and the pitfalls of big data adoption in conservative enterprise settings. Together, Noels and Guedes formed a perfect synergy: the visionary architect and the battle-tested engineer, delivering a presentation that was equal parts inspiration and practical engineering.

Abstract

This article represents an exhaustively elaborated, deeply extended, and comprehensively restructured expansion of Steven Noels and Bruno Guedes’ seminal 2012 DevoxxFR presentation, “Lily, Big Data for Dummies”, transformed into a definitive treatise on the democratization of big data technologies for the Java enterprise. Delivered in a bilingual format that reflected the global nature of the Apache community, the original talk introduced Lily as a groundbreaking platform that unified Apache HBase’s scalable, distributed storage with Apache Solr’s full-text search and analytics capabilities, all through a clean, type-safe Java API. The core promise was radical in its simplicity: enterprise Java developers could build petabyte-scale, real-time searchable data systems without writing a single line of MapReduce, without mastering Zookeeper quorum mechanics, and without abandoning the comforts of POJOs, annotations, and IDE autocompletion.

This expanded analysis delves far beyond the original demo to explore the philosophical foundations of Lily’s design, the architectural trade-offs in integrating HBase and Solr, the real-world production patterns that emerged from early adopters, and the lessons learned from scaling Lily to billions of records. It includes detailed code walkthroughs, performance benchmarks, schema evolution strategies, and failure mode analyses.

EDIT:
Updated for the 2025 landscape, this piece maps Lily’s legacy concepts to modern equivalents—Apache HBase 2.5, SolrCloud 9, OpenSearch, Delta Lake, Trino, and Spring Data Hadoop—while preserving the original vision of big data for the rest of us. Through rich narratives, architectural diagrams, and forward-looking speculation, this work serves not just as a historical archive, but as a practical guide for any Java team contemplating the leap into distributed, searchable big data systems.

The Big Data Barrier in 2012: Why Hadoop Was Hard for Java Developers

To fully grasp Lily’s significance, one must first understand the state of big data in 2012. The Apache Hadoop ecosystem—launched in 2006—was already a proven force in internet-scale companies like Yahoo, Facebook, and Twitter. HDFS provided fault-tolerant, distributed storage. MapReduce offered a programming model for batch processing. HBase, modeled after Google’s Bigtable, delivered random, real-time read/write access to massive datasets. And Solr, built on Lucene, powered full-text search at scale.

Yet for the average enterprise Java developer, this stack was inaccessible. Writing a MapReduce job required:
– Learning a functional programming model in Java that felt alien to OO practitioners.
– Mastering job configuration, input/output formats, and partitioners.
– Debugging distributed failures across dozens of nodes.
– Waiting minutes to hours for job completion.

HBase, while promising real-time access, demanded:
– Manual row key design to avoid hotspots.
– Deep knowledge of compaction, splitting, and region server tuning.
– Integration with Zookeeper for coordination.

Solr, though more familiar, required:
– Separate schema.xml and solrconfig.xml files.
– Manual index replication and sharding.
– Complex commit and optimization strategies.

The result? Big data remained the domain of specialized data engineers, not the Java developers who built the business logic. Lily was designed to change that.

Lily’s Core Philosophy: Big Data as a First-Class Java Citizen

At its heart, Lily was built on a simple but powerful idea: big data should feel like any other Java persistence layer. Just as Spring Data made MongoDB, Cassandra, or Redis accessible via repositories and annotations, Lily aimed to make HBase and Solr feel like JPA with superpowers.

The Three Pillars of Lily

Steven Noels articulated Lily’s architecture in three interconnected layers:

  1. The Storage Layer (HBase)
    Lily used HBase as its primary persistence engine, storing all data as versioned, column-family-based key-value pairs. But unlike raw HBase, Lily abstracted away row key design, column family management, and versioning policies. Developers worked with POJOs, and Lily handled the mapping.

  2. The Indexing Layer (Solr)
    Every mutation in HBase triggered an asynchronous indexing event to Solr. Lily maintained near-real-time consistency between the two systems, ensuring that search results reflected the latest data within milliseconds. This was achieved through a message queue (Kafka or RabbitMQ) and idempotent indexing.

  3. The Java API Layer
    The crown jewel was Lily’s type-safe, annotation-driven API. Developers defined their data model using plain Java classes:

@LilyRecord
public class Customer {
    @LilyId
    private String id;

    @LilyField(family = "profile")
    private String name;

    @LilyField(family = "profile")
    private int age;

    @LilyField(family = "activity", indexed = true)
    private List<String> recentSearches;

    @LilyFullText
    private String bio;
}

The @LilyRecord annotation told Lily to persist this object in HBase. @LilyField specified column families and indexing behavior. @LilyFullText triggered Solr indexing. No XML. No schema files. Just Java.

The Lily Repository: Spring Data, But for Big Data

Lily’s LilyRepository interface was modeled after Spring Data’s CrudRepository, but with big data superpowers:

public interface CustomerRepository extends LilyRepository<Customer, String> {
    List<Customer> findByName(String name);

    @Query("age:[* TO 30]")
    List<Customer> findYoungCustomers();

    @Query("bio:java AND recentSearches:hadoop")
    List<Customer> findJavaHadoopEnthusiasts();
}

Behind the scenes, Lily:
– Translated method names to HBase scans.
– Converted @Query annotations to Solr queries.
– Executed searches across sharded SolrCloud clusters.
– Returned fully hydrated POJOs.

Bruno Guedes demonstrated this in a live demo:

CustomerRepository repo = lily.getRepository(CustomerRepository.class);
repo.save(new Customer("1", "Alice", 28, Arrays.asList("java", "hadoop"), "Java dev at NGDATA"));
List<Customer> results = repo.findJavaHadoopEnthusiasts();

The entire operation—save, index, search—took under 50ms on a 3-node cluster.

Under the Hood: How Lily Orchestrated HBase and Solr

Lily’s magic was in its orchestration layer. When a save() was called:
1. The POJO was serialized to HBase Put operations.
2. The mutation was written to HBase with a version timestamp.
3. A change event was published to a message queue.
4. A Solr indexer consumed the event and updated the search index.
5. Near-real-time consistency was guaranteed via HBase’s WAL and Solr’s soft commits.

For reads:
findById → HBase Get.
findByName → HBase scan with secondary index.
@Query → Solr query with HBase post-filtering.

This dual-write, eventual consistency model was a deliberate trade-off for performance and scalability.
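A schematic Java sketch of steps 1 through 3 (ChangeQueue and the Customer getters are assumed for illustration; Lily’s real internals differed in detail):

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

interface ChangeQueue { void publish(String topic, String recordId); } // hypothetical queue facade

public class LilyStyleWriter {
    private final Table table;       // HBase table handle
    private final ChangeQueue queue; // Kafka/RabbitMQ would sit behind this

    public LilyStyleWriter(Table table, ChangeQueue queue) {
        this.table = table;
        this.queue = queue;
    }

    public void save(Customer c) throws IOException {
        // Steps 1-2: serialize the POJO into an HBase mutation; the write goes through the WAL.
        Put put = new Put(Bytes.toBytes(c.getId()));
        put.addColumn(Bytes.toBytes("profile"), Bytes.toBytes("name"), Bytes.toBytes(c.getName()));
        table.put(put);
        // Step 3: publish a change event; steps 4-5 happen in the Solr indexer that consumes it.
        queue.publish("record-changed", c.getId());
    }
}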

Schema Evolution and Versioning: The Enterprise Reality

One of Lily’s most enterprise-friendly features was schema evolution. In HBase, adding a column family requires manual admin intervention. In Lily, it was automatic:

// Version 1
@LilyField(family = "profile")
private String email;

// Version 2
@LilyField(family = "profile")
private String phone; // New field, no migration needed

Lily stored multiple versions of the same record, allowing old code to read new data and vice versa. This was critical for rolling deployments in large organizations.

Production Patterns and Anti-Patterns

Bruno Guedes shared war stories from production:
Hotspot avoidance: Never use auto-incrementing IDs. Use hashed or UUID-based keys (a sketch follows this list).
Index explosion: @LilyFullText on large fields → Solr bloat. Use @LilyField(indexed = true) for structured search.
Compaction storms: Schedule major compactions during low traffic.
Zookeeper tuning: Increase tick time for large clusters.
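For the hotspot point, a minimal sketch of row-key salting (hypothetical helper, not Lily API):

// Prefix a hash bucket so sequential IDs spread across region servers
// instead of hammering a single hotspot.
static String saltedRowKey(long sequentialId) {
    int bucket = Math.floorMod(Long.hashCode(sequentialId), 16); // 16 salt buckets
    return String.format("%02d-%016d", bucket, sequentialId);    // e.g. "07-0000000000001234"
}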

The Lily Ecosystem in 2012

Lily shipped with:
Lily CLI for schema inspection and cluster management.
Lily Maven Plugin for deploying schemas.
Lily SolrCloud Integration with automatic sharding.
Lily Kafka Connect for streaming data ingestion.

Lily’s Legacy After 2018: Where the Ideas Live On

EDIT:
Although Lily itself was archived in 2018, its core concepts continue to thrive in modern tools.

The original HBase POJO mapping is now embodied in Spring Data Hadoop.

Lily’s Solr integration has evolved into SolrJ + OpenSearch.

The repository pattern that Lily pioneered is carried forward by Spring Data R2DBC.

Schema evolution, once a key Lily feature, is now handled by Apache Atlas.

Finally, Lily’s near-real-time search capability lives on through the Elasticsearch Percolator.

Conclusion: Big Data Doesn’t Have to Be Hard

Steven Noels closed with a powerful message:

“Big data is not about MapReduce. It’s not about Zookeeper. It’s about solving business problems at scale. Lily proved that Java developers can do that—without becoming data engineers.”

EDIT:
In 2025, as lakehouse architectures, real-time analytics, and AI-driven search dominate, Lily’s vision of big data as a first-class Java citizen remains more relevant than ever.

Links

[DevoxxFR2013] MongoDB and Mustache: Toward the Death of the Cache? A Comprehensive Case Study in High-Traffic, Real-Time Web Architecture

Lecturers

Mathieu Pouymerol and Pierre Baillet were the technical backbone of Fotopedia, a photo-sharing platform that, at its peak, served over five million monthly visitors using a Ruby on Rails application that had been in production for six years. Mathieu, armed with degrees from École Centrale Paris and a background in building custom data stores for dictionary publishers, brought a deep understanding of database design, indexing, and performance optimization. Pierre, also from Centrale and with experience at Cambridge, had spent nearly a decade managing infrastructure, tuning Tomcat, configuring memcached, and implementing geoDNS systems. Together, they faced the ultimate challenge: keeping a legacy Rails monolith responsive under massive, unpredictable traffic while maintaining content freshness and developer velocity.

Abstract

This article presents an exhaustively detailed expansion of Mathieu Pouymerol and Pierre Baillet’s 2012 DevoxxFR presentation, “MongoDB et Mustache, vers la mort du cache ?”, reimagined as a definitive case study in high-traffic web architecture and the evolution of caching strategies. The Fotopedia team inherited a Rails application plagued by slow ORM queries, complex cache invalidation logic, and frequent stale data. Their initial response—edge-side includes (ESI), fragment caching, and multi-layered memcached—bought time but introduced fragility and operational overhead. The breakthrough came from a radical rethinking: use MongoDB as a real-time document store and Mustache as a logic-less templating engine to assemble pages dynamically, eliminating cache for the most volatile content.

This analysis walks through every layer of their architecture: from database schema design to template composition, from CDN integration to failure mode handling. It includes performance metrics, post-mortem analyses, and lessons learned from production incidents. Updated for 2025, it maps their approach to modern tools: MongoDB 7.0 with Atlas, server-side rendering with HTMX, edge computing via Cloudflare Workers, and Spring Boot with Mustache, offering a complete playbook for building cache-minimized, real-time web applications at scale.

The Legacy Burden: A Rails Monolith Under Siege

Fotopedia’s core application was built on Ruby on Rails 2.3, a framework that, while productive for startups, began to show its age under heavy load. The database layer relied on MySQL with aggressive sharding and replication, but ActiveRecord queries were slow, and joins across shards were impractical. The presentation layer used ERB templates with 15–20 partials per page, each with its own caching logic. The result was a cache dependency graph so complex that a single user action—liking a photo—could invalidate dozens of cache keys across multiple servers.

The team’s initial strategy was defense in depth:
Varnish at the edge with ESI for including dynamic fragments.
Memcached for fragment and row-level caching.
Custom invalidation daemons to purge stale cache entries.

But this created a house of cards. A missed invalidation led to stale comments. A cache stampede during a traffic spike brought the database to its knees. As Pierre put it, “We were not caching to improve performance. We were caching to survive.”

The Paradigm Shift: Real-Time Data with MongoDB

The turning point came when the team migrated dynamic, user-generated content—photos, comments, tags, likes—to MongoDB. Unlike MySQL, MongoDB stored data as flexible JSON-like documents, allowing embedded arrays and atomic updates:

{
  "_id": "photo_123",
  "title": "Sunset",
  "user_id": "user_456",
  "tags": ["paris", "sunset"],
  "likes": 1234,
  "comments": [
    { "user": "Alice", "text": "Gorgeous!", "timestamp": "2013-04-01T12:00:00Z" }
  ]
}

This schema eliminated joins and enabled single-document reads for most pages. Updates used atomic operators:

db.photos.updateOne(
  { _id: "photo_123" },
  { $inc: { likes: 1 }, $push: { comments: { user: "Bob", text: "Nice!" } } }
);

Indexes on user_id, tags, and timestamp ensured sub-millisecond query performance.
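With today’s MongoDB Java driver, those indexes could be declared as follows (a sketch; Fotopedia itself used the Ruby driver):

import com.mongodb.client.MongoClient;
import com.mongodb.client.MongoClients;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Indexes;
import org.bson.Document;

public class PhotoIndexes {
    public static void main(String[] args) {
        try (MongoClient client = MongoClients.create("mongodb://localhost")) {
            MongoCollection<Document> photos =
                    client.getDatabase("fotopedia").getCollection("photos");
            photos.createIndex(Indexes.ascending("user_id"));
            photos.createIndex(Indexes.ascending("tags"));      // multikey: indexes each array element
            photos.createIndex(Indexes.descending("comments.timestamp"));
        }
    }
}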

Mustache: The Logic-Less Templating Revolution

The second pillar was Mustache, a templating engine that enforced separation of concerns by allowing no logic in templates—only iteration and conditionals:

{{#photo}}
  <h1>{{title}}</h1>
  <img src="{{url}}" alt="{{title}}" />
  <p>By {{user.name}} • {{likes}} likes</p>
  <ul class="comments">
    {{#comments}}
      <li><strong>{{user}}</strong>: {{text}}</li>
    {{/comments}}
  </ul>
{{/photo}}

Because templates contained no business logic, they could be cached indefinitely in Varnish. Only the data changed—and that came fresh from MongoDB on every request.

data = mongo.photos.find(_id: params[:id]).first
html = Mustache.render(template, data)
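The same flow is engine-agnostic; in Java it might look like this with the mustache.java library (a sketch, not Fotopedia’s code):

import com.github.mustachejava.DefaultMustacheFactory;
import com.github.mustachejava.Mustache;
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;

public class RenderPhoto {
    public static void main(String[] args) {
        // Compile once: templates are logic-less, so they are cacheable forever.
        Mustache tpl = new DefaultMustacheFactory()
                .compile(new StringReader("<h1>{{title}}</h1><p>{{likes}} likes</p>"), "photo");
        // Render with fresh data on every request.
        StringWriter html = new StringWriter();
        tpl.execute(html, Map.of("title", "Sunset", "likes", 1234));
        System.out.println(html);
    }
}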

The Hybrid Architecture: Cache Where It Makes Sense

The final system was a hybrid of caching and real-time rendering:
Static assets (CSS, JS, images) → CDN with long TTL.
Static page fragments (headers, footers, sidebars) → Varnish ESI with 1-hour TTL.
Dynamic content (photo, comments, likes) → MongoDB + Mustache, no cache.

This reduced cache invalidation surface by 90% and average response time from 800ms to 180ms.

2025: The Evolution of Cache-Minimized Architecture

EDIT:
The principles pioneered by Fotopedia are now mainstream:
Server-side rendering with HTMX for dynamic updates.
Edge computing with Cloudflare Workers to assemble pages.
MongoDB Atlas with change streams for real-time UIs.
Spring Boot + Mustache for Java backends.

Links

[DevoxxFR2013] Clean JavaScript? Challenge Accepted: Strategies for Maintainable Large-Scale Applications

Lecturers

Romain Linsolas is a Java developer with over two decades of experience, passionate about technical innovation. He has worked at the CNRS on an astrophysics project, as a consultant at Valtech, and as a technical leader at Société Générale. Romain is actively involved in the developpez.com community as a writer and moderator, and he focuses on continuous integration principles to automate and improve team processes. Julien Jakubowski is a consultant and lead developer at OCTO Technology, with a decade of experience helping teams deliver high-quality software efficiently. He co-founded the Ch’ti JUG in Lille and has organized the Agile Tour Lille for two years.

Abstract

This article analyzes Romain Linsolas and Julien Jakubowski’s exploration of evolving JavaScript from rudimentary scripting to robust, large-scale application development. By dissecting historical pitfalls and modern solutions, the discussion evaluates architectural patterns, testing frameworks, and automation tools that enable clean, maintainable code. Contextualized within the shift from server-heavy Java applications to client-side dynamism, the analysis assesses methodologies for avoiding common errors, implications for developer productivity, and challenges in integrating diverse ecosystems. Through practical examples, it illustrates how JavaScript can support complex projects without compromising quality.

Historical Pitfalls and the Evolution of JavaScript Practices

JavaScript’s journey from a supplementary tool in the early 2000s to a cornerstone of modern web applications reflects broader shifts in user expectations and technology. Initially, developers like Romain and Julien used JavaScript for minor enhancements, such as form validations or visual effects, within predominantly Java-based server-side architectures. A typical 2003 example involved inline scripts to check input fields, turning them red on errors and preventing form submission. However, this approach harbored flaws: global namespace pollution from duplicated function names across files, implicit type coercions leading to unexpected concatenations instead of additions (e.g., “100” + 0.19 yielding “1000.19”), and public access to supposedly private variables, breaking encapsulation.

These issues stem from JavaScript’s design quirks, often labeled “dirty” due to surprising behaviors like empty array additions resulting in strings or NaN (Not a Number). Romain’s demonstrations, inspired by Gary Bernhardt’s critiques, highlight arithmetic anomalies where [] + {} equals “[object Object]” but {} + [] yields 0. Such inconsistencies, while entertaining, pose real risks in production code, as seen in scope leakage where loop variables overwrite each other, printing values only 10 times instead of 100.

The proliferation of JavaScript-driven applications, fueled by innovations from Gmail and Google Docs, necessitated more code—potentially 100,000 lines—demanding structured approaches. Early reliance on frameworks like Struts for server logic gave way to client-side demands for offline functionality and instant responsiveness, compelling developers to confront JavaScript’s limitations head-on.

Architectural Patterns for Scalable Code

To tame JavaScript’s chaos, modular architectures inspired by Model-View-Controller (MVC) patterns emerge as key. Frameworks like Backbone.js, AngularJS, and Ember.js facilitate separation of concerns: models handle data, views manage UI, and controllers orchestrate logic. For instance, in a beer store application, an MVC setup might use Backbone to define a Beer model with validation, a BeerView for rendering, and a controller to handle additions.

Modularization via patterns like the Module Pattern encapsulates code, preventing global pollution. A counter example encapsulates a private variable:

var Counter = (function() {
    var privateCounter = 0;
    function changeBy(val) {
        privateCounter += val;
    }
    return {
        increment: function() {
            changeBy(1);
        },
        value: function() {
            return privateCounter;
        }
    };
})();

This ensures privacy, unlike direct access in naive implementations. Advanced libraries like RequireJS implement Asynchronous Module Definition (AMD), loading dependencies on demand to avoid conflicts.

Expressivity is boosted by frameworks like CoffeeScript, which compiles to JavaScript with cleaner syntax, or Underscore.js for functional utilities. Julien’s analogy to appreciating pungent cheese after initial aversion captures the learning curve: mastering these tools reveals JavaScript’s elegance.

Testing and Automation for Reliability

Unit testing, absent in early practices, is now feasible with frameworks like Jasmine, adopting Behavior-Driven Development (BDD). Specs describe behaviors clearly:

describe("Beer addition", function() {
    it("should add a beer with valid name", function() {
        var beer = new Beer({name: "IPA"});
        expect(beer.isValid()).toBe(true);
    });
});

Tools like Karma run tests in real browsers, while Istanbul measures coverage. Automation integrates via Maven, Jenkins, or SonarQube, mirroring Java workflows. Violations from JSLint or compilation errors from Google Closure Compiler are flagged, ensuring syntax integrity.

Yeoman, combining Yo (scaffolding), Grunt (task running), and Bower (dependency management), streamlines setup. IDEs like IntelliJ or WebStorm provide seamless support, with Chrome DevTools for debugging.

Ongoing Challenges and Future Implications

Despite advancements, integration remains complex: combining MVC frameworks with testing suites requires careful orchestration, often involving custom recipes. Perennial concerns include framework longevity—Angular vs. Backbone—and team upskilling, demanding substantial training investments.

The implications are profound: clean JavaScript enables scalable, responsive applications, bridging Java developers into full-stack roles. By avoiding pitfalls through patterns and tools, projects achieve maintainability, reducing long-term costs. However, the ecosystem’s youth demands vigilance, as rapid evolutions could obsolete choices.

In conclusion, JavaScript’s transformation empowers developers to tackle ambitious projects confidently, blending familiarity with innovation for superior outcomes.

Links:

[DevoxxFR2013] From Cloud Experimentation to On-Premises Maturity: Strategic Infrastructure Repatriation at Mappy

Lecturer

Cyril Morcrette serves as Technical Director at Mappy, a pioneering French provider of geographic and local commerce services with thirteen million euros in annual revenue and eighty employees. Under his leadership, Mappy has evolved from a traditional route planning service into a comprehensive platform integrating immersive street-level imagery, local business discovery, and personalized recommendations. His infrastructure strategy reflects deep experience with both cloud and on-premises environments, informed by multiple large-scale projects that pushed technological boundaries.

Abstract

Cloud computing excels at enabling rapid prototyping and handling uncertain demand, but its cost structure can become prohibitive as projects mature and usage patterns stabilize. This presentation chronicles Mappy’s journey with immersive geographic visualization — a direct competitor to Google Street View — from initial cloud deployment to eventual repatriation to on-premises infrastructure. Cyril Morcrette examines the economic, operational, and technical factors that drove this decision, providing a framework for evaluating infrastructure choices throughout the application lifecycle. Through detailed cost analysis, performance metrics, and migration case studies, he demonstrates that cloud is an ideal launch platform but often not the optimal long-term home for predictable, high-volume workloads. The session concludes with practical guidance for smooth repatriation and the broader implications for technology strategy in established organizations.

The Immersive Visualization Imperative

Mappy’s strategic pivot toward immersive geographic experiences required capabilities beyond traditional mapping: panoramic street-level imagery, 3D reconstruction, and real-time interaction. The project demanded massive storage (terabytes of high-resolution photos), significant compute for image processing, and low-latency delivery to users.

Initial estimates suggested explosive, unpredictable traffic growth. Marketing teams envisioned viral adoption, while technical teams worried about infrastructure bottlenecks. Procuring sufficient on-premises hardware would require months of lead time and capital approval — unacceptable for a market-moving initiative.

Amazon Web Services offered an immediate solution: spin up instances, store petabytes in S3, process imagery with EC2 spot instances. The cloud’s pay-as-you-go model eliminated upfront investment and provided virtually unlimited capacity.

Cloud-First Development: Speed and Agility

The project launched entirely in AWS. Development teams used EC2 for processing pipelines, S3 for raw and processed imagery, CloudFront for content delivery, and Elastic Load Balancing for web servers. Auto-scaling handled traffic spikes during marketing campaigns.

This environment enabled rapid iteration:
– Photographers uploaded imagery directly to S3 buckets
– Lambda functions triggered processing workflows
– Machine learning models (running on GPU instances) detected business facades and extracted metadata
– Processed panoramas were cached in CloudFront edge locations

Within months, Mappy delivered a functional immersive experience covering major French cities. The cloud’s flexibility absorbed the uncertainty of early adoption while development teams refined algorithms and user interfaces.

The Economics of Maturity

As the product stabilized, usage patterns crystallized. Daily active users grew steadily but predictably. Storage requirements, while large, increased linearly. Processing workloads became batch-oriented rather than real-time.

Cost analysis revealed a stark reality: cloud expenses were dominated by data egress, storage, and compute hours — all now predictable and substantial. Mappy’s existing data center, built for core mapping services, had significant spare capacity with fully amortized hardware.

Cyril presents the tipping point calculation:
Cloud monthly cost: €45,000 (storage, compute, bandwidth)
On-premises equivalent: €12,000 (electricity, maintenance, depreciation)
Break-even: four months

The decision to repatriate was driven by simple arithmetic: at €33,000 saved per month, even a substantial one-off migration effort would pay for itself within the four-month break-even window. Execution, however, required careful planning.

Repatriation Strategy and Execution

The migration followed a phased approach:

  1. Data Transfer: Used AWS Snowball devices to move petabytes of imagery back to on-premises storage. Parallel uploads leveraged Mappy’s high-bandwidth connectivity.

  2. Processing Pipeline: Reimplemented image processing workflows on internal GPU clusters. Custom scripts replaced Lambda functions, achieving equivalent throughput at lower cost.

  3. Web Tier: Deployed Nginx and Varnish caches on existing web servers. CDN integration with Akamai preserved low-latency delivery.

  4. Monitoring and Automation: Migrated CloudWatch metrics to Prometheus/Grafana. Ansible playbooks replaced CloudFormation templates.

Performance remained comparable: page load times stayed under two seconds, and system availability exceeded 99.95%. The primary difference was cost — reduced by seventy-five percent.

Operational Benefits of On-Premises Control

Beyond economics, repatriation delivered strategic advantages:
Data Sovereignty: Full control over sensitive geographic imagery
Performance Predictability: Eliminated cloud provider throttling risks
Integration Synergies: Shared infrastructure with core mapping services reduced operational complexity
Skill Leverage: Existing systems administration expertise applied directly

Cyril notes that while cloud elasticity was lost, the workload’s maturity rendered it unnecessary. Capacity planning became straightforward, with hardware refresh cycles aligned to multi-year budgets.

Lessons for Infrastructure Strategy

Mappy’s experience yields a generalizable framework:
1. Use cloud for uncertainty: Prototyping, viral growth potential, or seasonal spikes
2. Monitor cost drivers: Storage, egress, compute hours
3. Model total cost of ownership: Include migration effort and operational overhead
4. Plan repatriation paths: Design applications with infrastructure abstraction
5. Maintain hybrid capability: Keep cloud skills current for future needs

The cloud is not a destination but a tool — powerful for certain phases, less optimal for others.

Conclusion: Right-Sizing Infrastructure for Business Reality

Mappy’s journey from cloud experimentation to on-premises efficiency demonstrates that infrastructure decisions must evolve with product maturity. The cloud enabled rapid innovation and market entry, but long-term economics favored internal hosting for stable, high-volume workloads. Cyril’s analysis provides a blueprint for technology leaders to align infrastructure with business lifecycle stages, avoiding the trap of cloud religion or on-premises dogma. The optimal stack combines both environments strategically, using each where it delivers maximum value.

Links:

[DevoxxFR2013] Developing Modern Web Apps with Backbone.js: A Live-Coded Journey from Empty Directory to Production-Ready SPA

Lecturer

Sylvain Zimmer represents the rare fusion of hacker spirit and entrepreneurial vision. In 2004, he launched Jamendo, which grew into the world’s largest platform for Creative Commons-licensed music, proving that open content could sustain a viable business model and empower artists globally. He co-founded Joshfire, a Paris-based agency specializing in connected devices and IoT solutions, and TEDxParis, democratizing access to transformative ideas. His competitive prowess shone in 2011 when his team won the Node Knockout competition in the Completeness category with Chess@home — a fully distributed chess AI implemented entirely in JavaScript, showcasing the language’s maturity for complex, real-time systems. Recognized as one of the first Google Developer Experts for HTML5, Sylvain recently solved a cryptographically hidden equation embedded in a Chromebook advertisement, demonstrating his blend of technical depth and puzzle-solving acumen. His latest venture, Pressing, continues his pattern of building elegant, user-centric solutions that bridge technology and human needs.

Abstract

In this intensely practical, code-only presentation, Sylvain Zimmer constructs a fully functional single-page application using Backbone.js from an empty directory to a polished, interactive demo in under thirty minutes. He orchestrates a modern frontend toolchain including Yeoman for project scaffolding, Grunt for task automation, LiveReload for instantaneous feedback, RequireJS for modular dependency management, and a curated selection of Backbone extensions to address real-world complexity. The session is a masterclass in architectural decision-making, demonstrating how to structure code for maintainability, scalability, and testability while avoiding the pitfalls of framework bloat. Attendees witness the evolution of a simple task manager into a sophisticated, real-time collaborative application, learning not just Backbone’s core MVC patterns but the entire ecosystem of best practices that define professional frontend engineering in the modern web era.

The Modern Frontend Development Loop: Zero Friction from Code to Browser

Sylvain initiates the journey with yo backbone, instantly materializing a complete project structure:

app/
  scripts/
    models/      collections/      views/      routers/
  styles/
  index.html
  Gruntfile.js

This scaffold is powered by Yeoman, which embeds Grunt as the task runner and LiveReload for automatic browser refresh. Every file save triggers a cascade of actions — CoffeeScript compilation, Sass preprocessing, JavaScript minification, and live injection into the browser — creating a development feedback loop with near-zero latency. This environment is not a convenience; it is a fundamental requirement for maintaining flow state and rapid iteration in modern web development.

Backbone Core Concepts: Models, Collections, Views, and Routers in Harmony

The application begins with a Task model that encapsulates state and behavior:

var Task = Backbone.Model.extend({
  defaults: {
    title: '',
    completed: false,
    priority: 'medium'
  },
  toggle: function() {
    this.save({ completed: !this.get('completed') });
  },
  validate: function(attrs) {
    if (!attrs.title.trim()) return "Title required";
  }
});

A TaskList collection manages persistence and business logic:

var TaskList = Backbone.Collection.extend({
  model: Task,
  localStorage: new Backbone.LocalStorage('tasks-backbone'),
  completed: function() { return this.where({completed: true}); },
  remaining: function() { return this.where({completed: false}); },
  comparator: 'priority'
});

The TaskView handles rendering and interaction using Underscore templates:

var TaskView = Backbone.View.extend({
  tagName: 'li',
  template: _.template($('#task-template').html()),
  events: {
    'click .toggle': 'toggleCompleted',
    'dblclick label': 'edit',
    'blur .edit': 'close',
    'keypress .edit': 'updateOnEnter'
  },
  initialize: function() {
    this.listenTo(this.model, 'change', this.render);
    this.listenTo(this.model, 'destroy', this.remove);
  },
  render: function() {
    this.$el.html(this.template(this.model.toJSON()));
    this.$el.toggleClass('completed', this.model.get('completed'));
    return this;
  }
});

An AppRouter enables clean URLs and state management:

var AppRouter = Backbone.Router.extend({
  routes: {
    '': 'index',
    'tasks/:id': 'show',
    'filter/:status': 'filter'
  },
  index: function() { /* render all tasks */ },
  filter: function(status) { /* update collection filter */ }
});

RequireJS: Enforcing Modularity and Asynchronous Loading Discipline

Global scope pollution is eradicated through RequireJS, configured in main.js:

require.config({
  paths: {
    'jquery': 'libs/jquery',
    'underscore': 'libs/underscore',
    'backbone': 'libs/backbone',
    'localstorage': 'libs/backbone.localStorage'
  },
  shim: {
    'underscore': { exports: '_' },
    'backbone': { deps: ['underscore', 'jquery'], exports: 'Backbone' }
  }
});

Modules are defined with explicit dependencies:

define(['views/task', 'collections/tasks'], function(TaskView, taskList) {
  return new TaskView({ collection: taskList });
});

This pattern ensures lazy loading, parallel downloads, and clear dependency graphs, critical for performance in large applications.

Backbone Extensions: Scaling from Prototype to Enterprise with Targeted Plugins

Backbone’s minimalism is a feature, not a limitation. Sylvain integrates extensions judiciously:

  • Backbone.LayoutManager: Manages nested views and layout templates, preventing memory leaks
  • Backbone.Paginator: Implements infinite scrolling with server or client pagination
  • Backbone.Relational: Handles one-to-many and many-to-many relationships with cascading saves
  • Backbone.Validation: Enforces model constraints with customizable error messages
  • Backbone.Stickit: Provides declarative two-way data binding for forms
  • Backbone.IOBind: Synchronizes models in real-time via Socket.IO

He demonstrates a live collaboration feature: when one user completes a task, a WebSocket event triggers an immediate UI update for all connected clients, showcasing real-time capabilities without server polling.

Architectural Best Practices: Building for the Long Term

The final application adheres to rigorous principles:

  • Single responsibility principle: Each view manages exactly one DOM element
  • Event-driven architecture: No direct DOM manipulation outside views
  • Separation of concerns: Models handle business logic, views handle presentation
  • Testability: Components are framework-agnostic and unit-testable with Jasmine or Mocha
  • Progressive enhancement: Core functionality works without JavaScript

Sylvain stresses that Backbone is a foundation, not a monolith — choose extensions based on specific needs, not trends.

Ecosystem and Learning Resources

He recommends Addy Osmani’s Backbone Fundamentals as the definitive free guide, the official Backbone.js documentation for reference, and GitHub for discovering community plugins. Tools like Marionette.js (application framework) and Thorax (Handlebars integration) are highlighted for larger projects.

The Broader Implications: Backbone in the Modern Frontend Landscape

While newer frameworks like Angular and React dominate headlines, Backbone remains relevant for its predictability, flexibility, and small footprint. It teaches fundamental MVC patterns that translate to any framework. Sylvain positions it as ideal for teams needing fine-grained control, gradual adoption, or integration with legacy systems.

Conclusion: From Demo to Deployable Reality

In under thirty minutes, Sylvain has built a production-ready SPA with real-time collaboration, offline storage, and modular architecture. He challenges attendees to fork the code, extend it, and ship something real. The tools are accessible, the patterns are proven, and the only barrier is action.

Links

[DevoxxFR2013] Soon, in a Galaxy Not So Far Away: Real-Time Web with Play 2, Akka, and Spaceships

Lecturer

Mathieu Ancelin is a software engineer at SERLI, specializing in Java EE technologies with a particular focus on component frameworks. He contributes to open-source projects such as GlassFish, JOnAS, and leads initiatives like CDI-OSGi and Play CDI. A member of the JSR 346 expert group for CDI 1.1, Ancelin regularly teaches at the University of La Rochelle and Poitiers, and speaks at conferences including JavaOne and Solutions Linux. He is active in the Poitou-Charentes JUG and can be followed on Twitter as @TrevorReznik.

Abstract

Mathieu Ancelin demystifies Play 2’s real-time capabilities, answering the perennial question: “WTF are Iteratees?” Through live demonstrations of two playful applications—a multiplayer spaceship battle and a real-time roulette game—he showcases how Play 2 leverages Iteratees, Akka actors, Server-Sent Events (SSE), WebSockets, HTML5 Canvas, and even webcam input to build responsive, interactive web experiences. The session explores how these APIs integrate seamlessly with Java and Scala, enabling developers to create low-latency, event-driven systems using their preferred language. Beyond the fun, Ancelin analyzes architectural patterns for scalability, backpressure handling, and state management in real-time web applications.

Demystifying Iteratees: Functional Streams for Non-Blocking I/O

Ancelin begins by addressing the confusion surrounding Iteratees, a functional reactive programming abstraction in Play 2. Unlike traditional imperative streams, Iteratees separate data production, processing, and consumption, enabling composable, backpressure-aware pipelines.

val enumeratee: Enumeratee[Array[Byte], String] = Enumeratee.map[Array[Byte]].apply[String] { bytes =>
  new String(bytes, "UTF-8")
}

This allows safe handling of chunked HTTP input without blocking threads. When combined with Enumerators (producers) and Enumeratees (transformers), they form robust data flows:

val socket: WebSocket[JsValue, JsValue] = WebSocket.using[JsValue] { request =>
  val in = Iteratee.foreach[JsValue](msg => actor ! msg).map(_ => actor ! PoisonPill)
  val out = Enumerator.fromCallback1(_ => futurePromise.future)
  (in, out)
}

Ancelin demonstrates how this pattern prevents memory leaks and thread exhaustion under load.

Akka Actors: Coordinating Game State and Player Actions

The spaceship game uses Akka actors to manage shared game state. A central GameActor maintains positions, velocities, and collisions:

class GameActor extends Actor {
  var players = Map.empty[String, Player]
  // One tick every 16 ms, i.e. roughly 60 simulation frames per second
  val ticker = context.system.scheduler.schedule(0.millis, 16.millis, self, Tick)

  def receive = {
    case Join(id, out) => players += (id -> Player(out)) // register the client's output channel
    case Input(id, thrust, rotate) => players(id).update(thrust, rotate) // apply player commands
    case Tick => broadcastState() // push the updated world state to every client
  }
}

Each client connects via WebSocket, sending input events and receiving rendered frames. The actor model ensures thread-safe updates and natural distribution.

Real-Time Rendering with Canvas and Webcam Integration

The game renders on HTML5 Canvas using client-side JavaScript. Server pushes state via SSE or WebSocket; client interpolates between ticks for smooth 60 FPS animation.

A bonus feature uses getUserMedia() to capture webcam input, mapping head tilt to ship rotation—an engaging demo of sensor fusion in the browser.

navigator.getUserMedia({ video: true }, stream => {
  video.src = URL.createObjectURL(stream);
  tracker.on('track', event => sendRotation(event.data.angle));
});

Play Roulette: SSE for Unidirectional Live Updates

The second demo, Play Roulette, uses Server-Sent Events to broadcast spin results to all connected clients:

def live = Action {
  Ok.chunked(results via EventSource()).as("text/event-stream")
}

Clients subscribe with:

const es = new EventSource('/live');
es.onmessage = e => updateWheel(JSON.parse(e.data));

This pattern excels for broadcast scenarios—news feeds, dashboards, live sports.

Language Interoperability: Java and Scala Working Together

Ancelin emphasizes Play 2’s dual-language support. Java developers use the same APIs via wrappers:

public static WebSocket<JsonNode> socket() {
    return WebSocket.withActor(GameActor::props);
}

This lowers the barrier for Java teams adopting reactive patterns.

Architecture Analysis: Scalability, Fault Tolerance, and Deployment

The system scales horizontally using Akka clustering. Game instances partition by room; a load balancer routes WebSocket upgrades. Failure recovery leverages supervisor strategies.

Deployment uses Play’s dist task to generate start scripts. For production, Ancelin recommends Typesafe ConductR or Docker with health checks.

Implications for Modern Web Applications

Play 2’s real-time stack enables:
Low-latency UX without polling
Efficient resource use via non-blocking I/O
Graceful degradation under load
Cross-language development in polyglot teams

From games to trading platforms, the patterns apply broadly.

Links:

[DevoxxFR2013] Between HPC and Big Data: A Case Study on Counterparty Risk Simulation

Lecturers

Jonathan Lellouche is a specialist in financial risk modeling, focusing on computational challenges at the intersection of high-performance computing and large-scale data processing.

Adrien Tay Pamart contributes to quantitative finance, developing simulations that balance accuracy with efficiency in volatile markets.

Abstract

Jonathan Lellouche and Adrien Tay Pamart examine counterparty risk simulation in market finance, where modeling asset variability demands intensive computation and massive data aggregation. They dissect a hybrid architecture blending HPC for Monte Carlo paths with big data tools for transactional updates and what-if analyses. Through volatility modeling, scenario generation, and incremental processing, they demonstrate achieving real-time insights amid petabyte-scale inputs. The study evaluates trade-offs in precision, latency, and cost, offering methodologies for similar domains requiring both computational depth and data agility.

Counterparty Risk Fundamentals: Temporal and Probabilistic Dimensions

Lellouche introduces counterparty risk as the potential loss from a trading partner’s default, amplified by market fluctuations. Simulation necessitates modeling time—forward projections of asset prices—and uncertainty via stochastic processes. Traditional approaches like Black-Scholes assume log-normal distributions, but real markets exhibit fat tails, requiring advanced techniques like Heston models for volatility smiles.

The computational burden arises from Monte Carlo methods: generating thousands of paths per instrument, each path a sequence of simulated prices. Pamart explains path dependence in instruments like barriers, where historical values influence payoffs, escalating memory and CPU demands.

Architectural Hybrid: Fusing HPC with Big Data Pipelines

The system partitions workloads: HPC clusters (CPU/GPU) compute raw scenarios; big data frameworks (Hadoop/Spark) aggregate and query. Lellouche details GPU acceleration for path generation, leveraging CUDA/OpenCL for parallel stochastic differential equations:

import numpy as np

def simulate_paths(S0, r, sigma, T, steps, paths):
    """Geometric Brownian motion: one row of simulated prices per Monte Carlo path."""
    dt = T / steps
    dW = np.random.normal(0, np.sqrt(dt), (paths, steps))
    S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt + sigma * dW, axis=1))
    return S

Big data handles post-processing: MapReduce jobs compute exposures, aggregating across scenarios for expected positive exposure (EPE).
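The aggregation itself is straightforward once paths exist; a small illustrative sketch (not the lecturers’ code) of EPE over a path-by-step exposure matrix:

// Expected positive exposure: EPE[t] = average over paths of max(exposure, 0) at step t.
static double[] expectedPositiveExposure(double[][] exposures) { // exposures[path][step]
    int paths = exposures.length, steps = exposures[0].length;
    double[] epe = new double[steps];
    for (double[] path : exposures)
        for (int t = 0; t < steps; t++)
            epe[t] += Math.max(path[t], 0.0) / paths;
    return epe;
}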

Incremental Processing and What-If Analysis: Efficiency in Volatility

Batch recomputation proves untenable for intraday updates. Pamart introduces incremental techniques: delta updates recompute only affected paths on market shifts. What-if simulations—hypothetical trades—leverage precomputed scenarios, overlaying perturbations.

This demands transactional big data stores like HBase for rapid inserts/queries. The duo analyzes latency: sub-second for deltas versus hours for full runs.

Volatility Modeling: From Simple Diffusions to Complex Stochastics

Basic Brownian motion suffices for equities but falters in options. Lellouche advocates local volatility models, calibrating to implied surfaces for accurate pricing. Calibration involves solving inverse problems, often via finite differences accelerated on GPUs.

Pamart warns of model risk: underestimating tails leads to underestimated exposures. Hybrid models blending stochastic volatility with jumps capture crises better.

Cost and Scalability Trade-offs: Cloud vs. On-Premises

On-premises clusters offer control but fixed costs; cloud bursts absorb peaks. Spot instances of the kind Fonrose advocates (though not directly cited) could slash expenses for non-urgent simulations. The lecturers evaluate AWS EMR for MapReduce and GPU instances for path generation.

Implications: hybrid clouds optimize, but data gravity—transferring terabytes—incurs latency and fees.

Future Directions: AI Integration and Regulatory Compliance

Emerging regulations (Basel III) mandate finer-grained simulations, amplifying data volumes. Lellouche speculates on ML for path reduction or anomaly detection.

The case underscores HPC-big data synergy: computation generates insights; data platforms deliver them actionably.

Links:

[DevoxxFR2013] Play Framework vs. Grails Smackdown: A Head-to-Head Comparison Through Real-World Development

Lecturers

James Ward has been a professional software developer since 1997 and currently works at AWS, dedicated to helping teams build effective applications. With experience across mountains of code and literal peaks, he shares discoveries through presentations, blogs, and demos. Previously a Technical Evangelist at Adobe and Typesafe, he focuses on scalable, reactive systems.

Matt Raible is a Java Champion and Developer Advocate, renowned for building web apps since the web’s dawn. Founder of AppFuse, author of Spring Live, and committer on Apache Roller and Struts, he has spoken globally on open-source adoption. His roles span UI Architect at LinkedIn, Evite.com, Time Warner Cable, and Oracle, emphasizing practical, performant solutions.

Abstract

James Ward and Matt Raible pit Play Framework against Grails in a comparative showdown, building identical “UberTracks” apps to evaluate productivity, performance, and ecosystem. Through lines of code, deployment simplicity, and scalability benchmarks, they dissect strengths: Grails’ convention-driven ORM versus Play’s reactive, stateless design. The session weighs server-side rendering, JavaScript integration, and backward compatibility, offering developers empirical guidance for framework choice in modern JVM web development.

Origins of the Smackdown: Hype Meets Hands-On Evaluation

Ward and Raible conceived this comparison amid hype surrounding full-stack JVM frameworks. Grails, leveraging Groovy’s conciseness, had dominated rapid development; Play, with Scala/Java options, promised superior performance and reactivity.

They built UberTracks—a music tracking app with user registration, OAuth, search, and social features—to benchmark real-world scenarios. This neutral app avoids framework biases, focusing on core web tasks.

Productivity Showdown: Lines of Code and Development Velocity

Grails shines in CRUD generation via scaffolding, minimizing boilerplate. Raible’s version clocked fewer lines overall, thanks to Groovy’s syntax sugar.

Play demands explicit controllers and views but excels in hot-reloading and type safety. Ward’s iteration emphasized Play’s compile-time checks, reducing runtime errors.

Both support rapid prototyping: Grails with grails generate-all, Play with activator new. Deployment to Heroku proved seamless for both, though Play’s statelessness eased scaling.

Performance and Scalability: Throughput Under Load

Benchmarks favored Play: higher requests/second in JMeter tests, lower memory footprint. Grails’ Hibernate sessions introduce state, complicating clustering; Play’s Akka integration enables reactive, non-blocking I/O.

Raible noted Grails’ plugin ecosystem for caching (Ehcache) mitigates issues, but Play’s built-in async support provides edge in high-concurrency scenarios.

Ecosystem and Community: Plugins, Documentation, and Job Market

Grails boasts 900+ plugins, covering security (Spring Security), search (ElasticSearch), and social (Spring Social). Documentation is exemplary, with books and tutorials abound.

Play’s ecosystem grows rapidly, emphasizing modularity via Typesafe Stack. Ward highlighted reactive manifesto alignment, fostering microservices.

Job trends show Grails’ maturity but Play’s ascent, particularly in startups.

JavaScript Integration and Modern Web Patterns

Both frameworks accommodate AngularJS or Backbone for SPAs. Raible’s Grails app used Grails-Asset-Pipeline for minification; Ward’s Play version leveraged WebJars.

Server-side templating—GSP in Grails, Twirl in Play—handles SEO-friendly rendering. Play’s JSON APIs pair naturally with client-side MV*.

Backward Compatibility and Maintenance Realities

AppFuse’s history informed this: Grails maintains smooth upgrades; Play 2.0 broke from 1.x, but migration guides exist. Raible praised Grails’ semantic versioning; Ward noted Play’s evolution prioritizes performance.

Conclusion: Contextual Winners in a Diverse Landscape

Raible and Ward conclude neither dominates universally. Grails suits data-heavy enterprise apps; Play excels in reactive, scalable services. Developers should prototype both, weighing team skills and requirements. The smackdown underscores JVM’s web strength, with both frameworks advancing the field.

Links:

[DevoxxFR2013] Invokedynamic in 45 Minutes: Unlocking Dynamic Language Performance on the JVM

Lecturer

Charles Nutter has spent over a decade as a Java developer and more than six years leading the JRuby project at Red Hat. Co-lead of JRuby, he works to fuse Ruby’s elegance with the JVM’s power while contributing to other JVM languages and educating the community on advanced virtual machine capabilities. A proponent of open standards, he aims to keep the JVM the premier managed runtime through innovations like invokedynamic.

Abstract

Charles Nutter demystifies invokedynamic, the JVM bytecode instruction introduced in Java 7 to optimize dynamic language implementation. He explains its mechanics—bootstrap methods, call sites, and method handles—through progressive examples, culminating in a toy language interpreter. The presentation contrasts invokedynamic with traditional invokevirtual and invokeinterface, benchmarks performance, and illustrates how it enables JRuby and other languages to approach native Java speeds, paving the way for polyglot JVM ecosystems.

The Problem with Traditional Invocation: Static Assumptions in a Dynamic World

Nutter begins with the JVM’s historical bias toward statically-typed languages. The four classic invocation instructions—invokevirtual, invokeinterface, invokestatic, and invokespecial—assume method resolution at class loading or compile time. For dynamic languages like Ruby, Python, or JavaScript, method existence and signatures are determined at runtime, forcing expensive runtime checks or megamorphic call sites.

JRuby, prior to invokedynamic, relied on reflection or generated bytecodes per call, incurring significant overhead. Even interface-based dispatch suffered from inline cache pollution when multiple implementations competed.

Invokedynamic Mechanics: Bootstrap, Call Sites, and Method Handles

Introduced via JSR-292, invokedynamic defers method linking to a user-defined bootstrap method (BSM). The JVM invokes the BSM once per call site, passing a CallSite object, method name, and type. The BSM returns a MethodHandle—a typed, direct reference to a method—installed into the call site.

Nutter demonstrates a simple BSM:

import java.lang.invoke.*;

public static CallSite bootstrap(MethodHandles.Lookup lookup, String name, MethodType type) throws Exception {
    // Link this call site once to a static target; the JVM then invokes the handle directly.
    MethodHandle mh = lookup.findStatic(MyClass.class, "target", type);
    return new ConstantCallSite(mh);
}

The resulting invokedynamic instruction executes the linked handle directly, bypassing vtable lookups.

Call Site Types and Guarded Invocations

Call sites come in three flavors: ConstantCallSite for immutable linkages, MutableCallSite for dynamic retargeting, and VolatileCallSite for atomic updates. Guarded invocations combine a test (guard) with a target handle:

MethodHandle guard = lookup.findStatic(Guards.class, "isString", MethodType.methodType(boolean.class, Object.class));
MethodHandle target = lookup.findStatic(Handlers.class, "handleString", type);
MethodHandle fallback = lookup.findStatic(Handlers.class, "handleOther", type);
MethodHandle guarded = MethodHandles.guardWithTest(guard, target, fallback);

The JVM inlines the guard, falling back only on failure, enabling polymorphic inline caches.

Building a Toy Language: From Parser to Execution

Nutter constructs a minimal scripting language with arithmetic and print statements. The parser generates invokedynamic instructions with a shared BSM. The BSM resolves operators (+, -, *) to overloaded Java methods based on argument types, caching results per call site.

Execution flows through method handles, achieving near-Java performance. He extends the example to support runtime method missing, emulating Ruby’s method_missing.
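A compact sketch of such an operator bootstrap (hypothetical; Nutter’s toy language was richer):

import java.lang.invoke.*;

public class OpBootstrap {
    public static int add(int a, int b) { return a + b; }
    public static double add(double a, double b) { return a + b; }

    // Called once per call site: the name would carry the operator ("+");
    // the site's static MethodType picks the matching overload.
    public static CallSite bootstrap(MethodHandles.Lookup lookup, String name,
                                     MethodType type) throws Exception {
        MethodHandle mh = lookup.findStatic(OpBootstrap.class, "add", type);
        return new ConstantCallSite(mh); // linked once, then invoked directly per call
    }
}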

Performance Analysis: Benchmarking Invocation Strategies

Nutter presents JMH benchmarks comparing invocation types. invokestatic serves as baseline; invokevirtual adds vtable dispatch; invokeinterface incurs interface check. invokedynamic with ConstantCallSite matches invokestatic, while MutableCallSite aligns with invokevirtual.

Key insight: the JVM’s optimizer treats stable invokedynamic sites as monomorphic, inlining aggressively. JRuby leverages this for core methods, reducing dispatch overhead by 10-100x.

Implications for JVM Languages and Future Evolution

Invokedynamic enables true polyglot JVMs. Nashorn (JavaScript), Dynalink, and Truffle frameworks build upon it. Future enhancements include value types and specialized generics, further reducing boxing.

Nutter concludes that invokedynamic fulfills John Rose’s vision: dynamic dispatch no slower than static, ensuring the JVM’s longevity as a universal runtime.

Links: