Problem:
- You have an app.
- It has state.
- What is your app doing?
- How do you inspect that state?
- Simple in-memory stats get hard to expose in multi-process environments.
Solution #1: Logging
Pros
- Universally supported
- Easily enhanced
- Persistent
Cons
- Libraries suck
- Operational burden (rotating, shipping, routing, etc)
- Records events more than inspects state
- Difficult to predict where needed
- Performance impact if too verbose
Solution #2: Graphite
Pros
- Fast, sexy, “enterprise”
- Python
Cons
- Do you want a graph or a graph?
- Have fun installing it.
- Still not great for introspection.
Solution #3: socketconsole
Pros:
- Pure Python.
- Very useful for deadlocks, blocking code, threaded apps.
- Simple to integrate:
    import socketconsole
    socketconsole.launch()
Cons:
- CPython only.
- Doesn’t work with gevent or eventlet monkeypatching.
- Doesn’t work with greenthreads.
- Limited functionality.
- All the fun of Python threads.
Solution #4: REPL Backdoors
Pros:
- Pure Python!
- Changing code at runtime is for winners.
- Inspect all the things!
Cons:
- With great power comes great responsibility.
- Requires threads or event loop.
- Still can’t reach all state.
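To make the "requires threads or event loop" point concrete, here is a minimal sketch of a thread-based backdoor built only from the standard library. The names BackdoorHandler and start_backdoor are hypothetical, not taken from any existing backdoor package, and the output capture is deliberately naive.

```python
import code
import contextlib
import io
import socketserver
import threading


class BackdoorHandler(socketserver.StreamRequestHandler):
    """Evaluate each line a client sends inside the app's namespace."""

    def handle(self):
        console = code.InteractiveConsole(self.server.namespace)
        for raw in self.rfile:
            out = io.StringIO()
            # Caveat: these redirects are process-wide, so output printed by
            # other threads during the call is captured too. Fine for a
            # sketch, not for production.
            with contextlib.redirect_stdout(out), contextlib.redirect_stderr(out):
                console.push(raw.decode("utf-8").rstrip("\r\n"))
            self.wfile.write(out.getvalue().encode("utf-8"))
            self.wfile.flush()


def start_backdoor(namespace, host="127.0.0.1", port=0):
    """Serve a REPL over TCP from a daemon thread; port 0 picks a free port."""
    server = socketserver.ThreadingTCPServer((host, port), BackdoorHandler)
    server.daemon_threads = True
    server.namespace = namespace  # dict the remote code reads and mutates
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling start_backdoor(globals()) at startup and connecting with a tool such as netcat gives a line-at-a-time prompt into the live process. Anyone who can reach the port can run arbitrary code, which is the "great responsibility" part.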
Solution #5: GDB
Pros:
- Well, you wanted introspection.
- Has some Python helpers: pygdb, gdb-heap
Cons:
- Seriously? Definitely only a last resort.
Solution #6: JMX
Pros:
- Universally supported.
- Powerful, extensible, 2-way.
- Helpful tools.
Cons:
- Where “universal” means “runs on the JVM”.
- Not as easy to monitor as you’d think.
Note
Python needs this.
mmstats Goals
- Simple API to expose state.
- Separate publishing from reading, aggregating, etc.
- Language, platform, framework agnostic.
- Minimal & predictable performance impact.
- Optional persistence (e.g. post-mortems)
- 1-way (for now?)
What is mmstats?
mmap allows sharing memory between processes.
Language independent data structure:
- Series of fields (structs)
- Fields have label, type, and values.
Exposed in Python app as a model class.
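The idea of a series of fixed-size fields in a memory-mapped file can be sketched with the standard mmap and struct modules. The layout below (16-byte label, 1-byte type code, 8-byte value) and the function names are assumptions for illustration only, not the actual mmstats format or API.

```python
import mmap
import struct

# Hypothetical field layout (NOT the real mmstats format):
# 16-byte label, 1-byte type code, 8-byte unsigned integer value.
FIELD = struct.Struct("<16scQ")


def create_stats_file(path, labels):
    """Writer side: create a file with one zeroed counter field per label."""
    with open(path, "wb") as f:
        for label in labels:
            f.write(FIELD.pack(label.encode("ascii"), b"Q", 0))


def set_field(mm, index, value):
    """Writer side: overwrite the value of the field at the given index."""
    offset = index * FIELD.size
    label, type_code, _old = FIELD.unpack_from(mm, offset)
    FIELD.pack_into(mm, offset, label, type_code, value)


def read_fields(path):
    """Reader side: map the file read-only and decode every field."""
    with open(path, "rb") as f:
        mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        data = mm[:]  # copy the bytes out so the mapping can be closed
        mm.close()
    return {
        label.rstrip(b"\0").decode("ascii"): value
        for label, _type_code, value in FIELD.iter_unpack(data)
    }
```

The writer would open the file with open(path, "r+b"), map it with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE), and call set_field on the mapping. Because every field has the same size, a reader in any language can walk the file knowing only the layout.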
Performance Implementation
Single writer, multi-reader
- No locks.
- No syscalls (write, read, send, recv, etc).
- All in userspace for readers & writers.
- Reading has no impact on writers.
- Fixed field sizes.
Consistency without Locks
- Q: How are reads consistent without locks?
- A: Double buffering.
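A rough sketch of double buffering for a single field: the writer fills the slot readers are not looking at, then publishes it by flipping a one-byte index. The layout and names here are assumptions for illustration and may differ from how mmstats actually arranges its buffers.

```python
import mmap
import struct

# Hypothetical layout: a 1-byte "active slot" index followed by two
# 8-byte slots that each hold one double.
VALUE = struct.Struct("<d")
INDEX_OFFSET = 0
SLOT_OFFSETS = (1, 1 + VALUE.size)
REGION_SIZE = 1 + 2 * VALUE.size


def write_value(mem, value):
    """Single writer: fill the inactive slot, then flip the index byte."""
    inactive = 1 - mem[INDEX_OFFSET]
    VALUE.pack_into(mem, SLOT_OFFSETS[inactive], value)
    mem[INDEX_OFFSET] = inactive  # one-byte store publishes the new value


def read_value(mem):
    """Any reader: look up the active slot and decode it."""
    active = mem[INDEX_OFFSET]
    return VALUE.unpack_from(mem, SLOT_OFFSETS[active])[0]
```

With mem = mmap.mmap(-1, REGION_SIZE) standing in for the shared file, a reader never sees a half-written value as long as it finishes reading before the writer has written twice more. This sketch ignores that edge case and any memory-ordering concerns on the real hardware.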
Things it does:
- Query count.
- Hostname of the machine.
- Time delta from request start to response start.
- Memory delta.
- Load delta.
- Etc.
How it does it:
- Database backend.
- Uses celery.
- A couple system calls for the deltas.
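The deltas listed above could be gathered roughly like this, using only standard-library calls on a Unix-like system. The measure helper and the key names are hypothetical, and the sketch leaves out the database backend and celery parts entirely.

```python
import os
import resource
import socket
import time
from contextlib import contextmanager


def _snapshot():
    """Collect the raw numbers that the deltas are computed from."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    try:
        load = os.getloadavg()[0]
    except OSError:  # load average unavailable on this platform
        load = 0.0
    return {
        "time": time.monotonic(),
        "max_rss": usage.ru_maxrss,  # kilobytes on Linux, bytes on macOS
        "load_1m": load,
    }


@contextmanager
def measure(stats):
    """Fill `stats` with the hostname and before/after deltas for a block."""
    before = _snapshot()
    try:
        yield stats
    finally:
        after = _snapshot()
        stats["hostname"] = socket.gethostname()
        for key in before:
            stats[key + "_delta"] = after[key] - before[key]
```

Wrapping the request handling in "with measure(stats):" leaves time_delta, max_rss_delta, and load_1m_delta in the dict, ready to be handed to whatever stores them.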
Who should use it:
- Any Django site that does anything interesting (sites that handle a decent amount of traffic, say several hits per minute).