PDX Python Meetup

September, 2011

Delicious data with mmstats

Problem:

  • You have an app.
  • It has state.
  • What is your app doing?
  • How do you inspect that state?
  • Simple in-memory stats get hard to expose in multi-process environments.

Solution #1: Logging


Pros

  • Universally supported
  • Easily enhanced
  • Persistent

Cons

  • Libraries suck
  • Operational burden (rotating, shipping, routing, etc)
  • Records events more than inspects state
  • Difficult to predict where needed
  • Performance impact if too verbose

Solution #2: Graphite


Pros

  • Fast, sexy, “enterprise”
  • Python

Cons

  • Do you want a graph or a graph?
  • Have fun installing it.
  • Still not great for introspection.

Solution #3: socketconsole


Pros:

  • Pure Python.

  • Very useful for deadlocks, blocking code, threaded app.

  • Simple to integrate:

    import socketconsole
    socketconsole.launch()
    

Cons:

  • CPython only.
  • Doesn’t work with gevent or eventlet monkeypatching.
  • Doesn’t work with greenthreads.
  • Limited functionality.
  • All the fun of Python threads.

Solution #4: REPL Backdoors


Pros:

  • Pure Python!
  • Changing code at runtime is for winners.
  • Inspect all the things!

Cons:

  • With great power comes great responsibility.
  • Requires threads or event loop.
  • Still can’t reach all state.
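
As a rough illustration of the backdoor idea, the sketch below uses the standard library's code.InteractiveConsole to run source text against a live namespace. It shows only the evaluation half: a real backdoor would feed lines in from a socket on a thread or event loop, which is omitted here. The helper make_backdoor_console and the app_state dict are hypothetical names for this example.

```python
import code


def make_backdoor_console(namespace):
    """Return an interpreter bound to `namespace` (hypothetical helper)."""
    return code.InteractiveConsole(locals=namespace)


# Hypothetical application state that we want to inspect and change at runtime.
app_state = {"hits": 0}

console = make_backdoor_console({"app_state": app_state})

# push() runs one line of source; it returns True only when more input
# is needed to complete the statement (e.g. inside an open block).
more = console.push("app_state['hits'] += 1")
```

Because the console shares the dict with the application, the change is visible to the app immediately, which is both the appeal and the risk listed above.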

Solution #5: GDB


Pros:

  • Well, you wanted introspection.
  • Has some Python helpers: pygdb, gdb-heap

Cons:

  • Seriously? Definitely only a last resort.

Solution #6: JMX


Pros:

  • Universally supported.
  • Powerful, extensible, 2-way.
  • Helpful tools.

Cons:

  • Where “universal” means “runs on the JVM”.
  • Not as easy to monitor as you’d think.

Note

Python needs this.

mmstats Goals


  • Simple API to expose state.
  • Separate publishing from reading, aggregating, etc.
  • Language, platform, framework agnostic.
  • Minimal & predictable performance impact.
  • Optional persistence (e.g. post-mortems)
  • 1-way (for now?)

What is mmstats?


  • mmap allows sharing memory between processes.

  • Language independent data structure:

    • Series of fields (structs)
    • Fields have label, type, and values.
  • Exposed in Python app as a model class.
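
The idea of labelled, typed fields in a memory-mapped file can be sketched with only the standard library. The layout below (2-byte label length, label, 1-byte struct type code, value) is an assumption made up for this example and is not the actual mmstats format; pack_field, write_stats, and read_stats are hypothetical names.

```python
import mmap
import struct


def pack_field(label, type_code, value):
    """Encode one field as: label length, label, type code, value (assumed layout)."""
    raw = label.encode("utf-8")
    return (struct.pack("<H", len(raw)) + raw
            + type_code.encode("ascii")
            + struct.pack("<" + type_code, value))


def unpack_fields(buf):
    """Walk the buffer and return {label: value}; a zero label length ends the series."""
    fields, offset = {}, 0
    while offset + 2 <= len(buf):
        (label_len,) = struct.unpack_from("<H", buf, offset)
        if label_len == 0:
            break
        offset += 2
        label = bytes(buf[offset:offset + label_len]).decode("utf-8")
        offset += label_len
        type_code = chr(buf[offset])
        offset += 1
        (value,) = struct.unpack_from("<" + type_code, buf, offset)
        offset += struct.calcsize("<" + type_code)
        fields[label] = value
    return fields


def write_stats(path, fields, size=4096):
    """Writer side: create a zero-filled file of fixed size, map it, store the fields."""
    data = b"".join(pack_field(*field) for field in fields)
    with open(path, "wb") as fh:
        fh.truncate(size)
    with open(path, "r+b") as fh:
        mm = mmap.mmap(fh.fileno(), size)
        mm[:len(data)] = data
        mm.flush()
        mm.close()


def read_stats(path):
    """Reader side: any process can map the same file read-only and decode it."""
    with open(path, "rb") as fh:
        mm = mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            return unpack_fields(mm)
        finally:
            mm.close()
```

For example, write_stats(path, [("requests", "q", 42)]) in the app and read_stats(path) in a separate monitoring process would share the counter without the two processes talking to each other.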

Performance Implementation


Single writer, multi-reader

  • No locks.
  • No syscalls (write, read, send, recv, etc).
  • All in userspace for readers & writers.
  • Reading has no impact on writers.
  • Fixed field sizes.

Consistency without Locks


Q: How are reads consistent without locks?
A: Double buffering.
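
A minimal sketch of double buffering for a single field, assuming one index byte followed by two fixed-size value slots (a layout invented for this example, not necessarily what mmstats stores). The writer fills the inactive slot and only then flips the index, so a reader that follows the index does not see a half-written value. In Python this only illustrates the ordering; real guarantees depend on the flip being a single store and on the platform's memory model.

```python
import struct

SLOT = struct.Struct("<q")  # each slot holds one signed 64-bit value


class DoubleBufferedField:
    """One index byte plus two slots inside a shared buffer (assumed layout)."""

    def __init__(self, buf, offset=0):
        self.buf = buf          # bytearray or writable mmap
        self.offset = offset    # position of the index byte

    def _slot_offset(self, index):
        return self.offset + 1 + index * SLOT.size

    def write(self, value):
        # Step 1: write the new value into the slot readers are NOT using.
        inactive = 1 - self.buf[self.offset]
        SLOT.pack_into(self.buf, self._slot_offset(inactive), value)
        # Step 2: publish it by flipping the index byte.
        self.buf[self.offset] = inactive

    def read(self):
        # Readers follow the index byte to the last fully written slot.
        active = self.buf[self.offset]
        (value,) = SLOT.unpack_from(self.buf, self._slot_offset(active))
        return value


buf = bytearray(1 + 2 * SLOT.size)
field = DoubleBufferedField(buf)
field.write(5)
field.write(7)
latest = field.read()
```

The fixed slot size matters here: because every value occupies the same number of bytes, the writer never has to move or resize anything that a reader might be looking at.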

django-slow-log or How we found a view doing 100k queries

Things it does:

  • Query count.
  • Hostname of the machine.
  • Time delta from request start to response start.
  • Memory delta.
  • Load delta.
  • Etc.

How it does it:

  • Database backend.
  • Uses celery.
  • A couple system calls for the deltas.
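
The kind of per-request deltas listed above can be approximated with a few standard-library calls. This is a sketch of the general approach on a Unix-like system, not django-slow-log's actual code; snapshot and measure are hypothetical names, and the query count, Celery, and database parts are left out.

```python
import os
import resource
import socket
import time


def snapshot():
    """Capture the values we want to diff around a unit of work."""
    try:
        load = os.getloadavg()[0]   # 1-minute load average
    except OSError:
        load = 0.0                  # load average unavailable on this system
    return {
        "time": time.monotonic(),
        # Peak resident set size so far (kilobytes on Linux, bytes on macOS).
        "max_rss": resource.getrusage(resource.RUSAGE_SELF).ru_maxrss,
        "load": load,
    }


def measure(func, *args, **kwargs):
    """Run func and return (result, record of deltas plus hostname)."""
    before = snapshot()
    result = func(*args, **kwargs)
    after = snapshot()
    record = {"hostname": socket.gethostname()}
    for key in before:
        record[key + "_delta"] = after[key] - before[key]
    return result, record


result, record = measure(sum, [1, 2, 3])
```

In a Django setting the same before/after pair would sit in middleware around the view call, with the record handed off for storage afterwards.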

Who should use it:

  • Any Django site that does anything interesting (e.g. sites with a decent amount of traffic, say several hits per minute).
