The Reference Member Trap
When a constructor parameter outlives its welcome.
Most QuantLib classes store shared_ptr members by value. A few store them by reference. The distinction is invisible at the API level but creates a use-after-free bug when the class is constructed through pybind11.
The Setup
FdmCEVOp is a finite-difference operator for the Constant Elasticity of Variance model. Its constructor takes a yield term structure:
FdmCEVOp(const ext::shared_ptr<FdmMesher>& mesher,
         const ext::shared_ptr<YieldTermStructure>& rTS,
         Real f0, Real alpha, Real beta,
         Size direction);
Nothing unusual. Every FDM operator constructor looks like this. The binding follows the standard pattern:
py::class_<FdmCEVOp, FdmLinearOpComposite, ext::shared_ptr<FdmCEVOp>>(
    m, "FdmCEVOp", "CEV FDM operator.")
    .def(py::init<const ext::shared_ptr<FdmMesher>&,
                  const ext::shared_ptr<YieldTermStructure>&,
                  Real, Real, Real, Size>(),
         py::arg("mesher"), py::arg("rTS"),
         py::arg("f0"), py::arg("alpha"), py::arg("beta"),
         py::arg("direction"));
This compiles. It imports. It constructs the object without error. It crashes later, when setTime() or apply() tries to dereference rTS_.
The Symptom
op = ql.FdmCEVOp(mesher, flat_curve, 100.0, 0.3, 0.5, 0)
op.setStep(0.01)
op.setTime(0.0, 1.0) # Access violation or garbage values
The crash is not deterministic. Sometimes it works. Sometimes it reads garbage. Sometimes it segfaults. The classic signature of a dangling reference.
The Cause
Look at the private members:
class FdmCEVOp : public FdmLinearOpComposite {
    // ...
  private:
    const ext::shared_ptr<YieldTermStructure>& rTS_;  // Reference!
    Size direction_;
    TripleBandLinearOp dxxMap_;
    TripleBandLinearOp mapT_;
};
rTS_ is not a shared_ptr<YieldTermStructure>. It is a const shared_ptr<YieldTermStructure>& – a reference to a shared_ptr. The constructor initializes it from its parameter:
FdmCEVOp::FdmCEVOp(
    const ext::shared_ptr<FdmMesher>& mesher,
    const ext::shared_ptr<YieldTermStructure>& rTS,  // Parameter
    ...)
: rTS_(rTS),  // Binds reference to parameter
  ...
In pure C++ this works as long as the caller passes a named shared_ptr<YieldTermStructure> and keeps it alive for the lifetime of the FdmCEVOp. The reference avoids an extra copy and a reference count increment. It is a micro-optimization.
In pybind11, it is a time bomb:
1. Python calls op = ql.FdmCEVOp(mesher, flat_curve, ...).
2. pybind11 converts flat_curve into a temporary shared_ptr<YTS> and passes it to the constructor.
3. The constructor runs. rTS_ binds to the temporary.
4. The constructor returns. The temporary shared_ptr is destroyed. rTS_ now references deallocated memory.
5. A later call like op.setTime(0.0, 1.0) reads through the dangling reference – undefined behavior.
pybind11 converts the Python flat_curve object into a shared_ptr<YieldTermStructure> and passes it to the constructor. The constructor stores a reference to this temporary. The temporary is destroyed when the constructor returns. rTS_ survives, pointing at nothing.
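The order of events can be traced without pybind11 or QuantLib. The following is a minimal sketch, not the actual binding machinery: Curve, TracedHolder, RefOp, and simulatedBindingCall are hypothetical stand-ins, and TracedHolder wraps the shared_ptr only so that its destructor can record when the per-call temporary dies. The sketch never reads through the dangling reference, since doing so would be undefined behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names exist in QuantLib or pybind11.
std::vector<std::string> events;  // records construction/destruction order

struct Curve { double rate; };

// Plays the role of the temporary shared_ptr that lives for one call only.
struct TracedHolder {
    std::shared_ptr<Curve> ptr;
    ~TracedHolder() { events.push_back("holder destroyed"); }
};

// Mirrors the FdmCEVOp layout: the member is a reference, not a copy.
class RefOp {
  public:
    explicit RefOp(const TracedHolder& holder) : holder_(holder) {
        events.push_back("op constructed");
    }
    ~RefOp() { events.push_back("op destroyed"); }

    // Reads through the reference. Only safe while the referenced
    // TracedHolder is still alive; this sketch never calls it afterwards.
    double rate() const { return holder_.ptr->rate; }

  private:
    const TracedHolder& holder_;
};

// Plays the role of the generated constructor wrapper: the holder is a local
// object that is destroyed when this function returns.
std::unique_ptr<RefOp> simulatedBindingCall(const std::shared_ptr<Curve>& curve) {
    TracedHolder temporary{curve};
    return std::unique_ptr<RefOp>(new RefOp(temporary));
}
```

After simulatedBindingCall returns, the log contains "op constructed" followed by "holder destroyed", while the returned RefOp is still alive. That gap between the holder's destruction and the operator's destruction is the window in which the real rTS_ dangles.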
Why It Is Hard to Spot
The bug has three properties that make it insidious:
It compiles without warnings. The code is valid C++. The reference binds to a valid object during construction. No compiler or static analyzer flags it, because the problem is about object lifetimes across language boundaries, not about C++ semantics.
It looks identical to safe patterns. Compare FdmCEVOp’s constructor to FdmSabrOp:
// FdmCEVOp - dangerous
FdmCEVOp(const ext::shared_ptr<FdmMesher>& mesher,
         const ext::shared_ptr<YieldTermStructure>& rTS, ...);
// Member: const shared_ptr<YTS>& rTS_;  ← reference

// FdmSabrOp - safe
FdmSabrOp(const ext::shared_ptr<FdmMesher>& mesher,
          const ext::shared_ptr<YieldTermStructure>& rTS, ...);
// Member: const shared_ptr<YTS> rTS_;   ← value
The constructor signatures are identical. Only the header’s private section reveals the difference. One stores a value. The other stores a reference. The distinction is a single & character.
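One way to see what that single & changes is to watch the reference count. The sketch below uses std::shared_ptr and hypothetical stand-in classes (Curve, ValueOp, RefOp) rather than the real QuantLib operators. It assumes only that a value member shares ownership and a reference member does not.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a yield term structure.
struct Curve { double rate; };

// Value member: copies the shared_ptr and therefore shares ownership.
class ValueOp {
  public:
    explicit ValueOp(const std::shared_ptr<Curve>& curve) : curve_(curve) {}
    double rate() const { return curve_->rate; }
  private:
    const std::shared_ptr<Curve> curve_;
};

// Reference member: aliases the caller's shared_ptr and owns nothing.
class RefOp {
  public:
    explicit RefOp(const std::shared_ptr<Curve>& curve) : curve_(curve) {}
    double rate() const { return curve_->rate; }
  private:
    const std::shared_ptr<Curve>& curve_;
};

// Number of owners of the curve while a ValueOp built from it is alive.
long ownersAfterValueOp(const std::shared_ptr<Curve>& curve) {
    ValueOp op(curve);
    static_cast<void>(op);
    return curve.use_count();
}

// Number of owners of the curve while a RefOp built from it is alive.
long ownersAfterRefOp(const std::shared_ptr<Curve>& curve) {
    RefOp op(curve);
    static_cast<void>(op);
    return curve.use_count();
}
```

Starting from a single owner, ownersAfterValueOp reports 2 and ownersAfterRefOp reports 1. The reference member never takes a share of ownership, so nothing it does can keep the curve, or the shared_ptr it aliases, alive.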
It sometimes works. If the memory previously occupied by the temporary happens to still contain valid data, the operator produces correct results. The bug only manifests when the memory is reused. In small test programs, the stack frame may not be overwritten before setTime() runs. In production code with more allocations between construction and use, the crash is more likely.
The Fix
The goal: keep the shared_ptr alive for the lifetime of the FdmCEVOp, without modifying QuantLib source code.
The approach: heap-allocate a copy of the shared_ptr and tie its lifetime to the FdmCEVOp through a custom deleter.
.def(py::init([](const ext::shared_ptr<FdmMesher>& mesher,
                 const ext::shared_ptr<YieldTermStructure>& rTS,
                 Real f0, Real alpha, Real beta, Size direction) {
    // 1. Copy the shared_ptr onto the heap
    auto rTSCopy = ext::make_shared<ext::shared_ptr<YieldTermStructure>>(rTS);
    // 2. Construct FdmCEVOp, passing the heap copy by reference
    //    rTS_ now references *rTSCopy, which lives on the heap
    auto op = ext::shared_ptr<FdmCEVOp>(
        new FdmCEVOp(mesher, *rTSCopy, f0, alpha, beta, direction),
        // 3. Custom deleter captures rTSCopy, preventing destruction
        //    until the FdmCEVOp itself is destroyed
        [rTSCopy](FdmCEVOp* p) { delete p; });
    return op;
}), ...)
Three things work together:
1. make_shared<shared_ptr<YTS>>(rTS) creates a heap-allocated shared_ptr that is a copy of rTS. This is a shared_ptr to a shared_ptr – two levels of indirection, intentionally.
2. *rTSCopy dereferences the outer pointer, yielding a reference to the inner shared_ptr<YTS> on the heap. This reference is what rTS_ binds to. Unlike the stack temporary, this heap object persists.
3. The custom deleter lambda captures rTSCopy. As long as the FdmCEVOp’s owning shared_ptr exists, rTSCopy exists, and the heap-allocated shared_ptr<YTS> it points to exists. When the FdmCEVOp is destroyed, the deleter runs, rTSCopy is released, and the yield term structure’s reference count decrements normally.
The resulting ownership chain:
- The Python-side shared_ptr<FdmCEVOp> holds the FdmCEVOp object. Its custom deleter captures rTSCopy.
- rTSCopy is a shared_ptr that owns a heap-allocated shared_ptr<YTS>.
- FdmCEVOp::rTS_ references that heap-allocated shared_ptr<YTS>.

When the Python holder’s reference count reaches zero, the custom deleter runs, releasing rTSCopy. The heap-allocated shared_ptr<YTS> is destroyed, decrementing the yield term structure’s reference count. Everything tears down in the correct order.
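The ownership chain can be checked in isolation. The sketch below reproduces the same technique with std::shared_ptr and hypothetical stand-ins (Curve, RefOp, makeRefOp) in place of the QuantLib types and the pybind11 factory. It is an illustration under those assumptions, not the binding code itself.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a yield term structure.
struct Curve { double rate; };

// Mirrors the FdmCEVOp layout: stores a reference to the caller's shared_ptr.
class RefOp {
  public:
    explicit RefOp(const std::shared_ptr<Curve>& curve) : curve_(curve) {}
    double rate() const { return curve_->rate; }
  private:
    const std::shared_ptr<Curve>& curve_;
};

// Hypothetical factory showing the heap-copy plus custom-deleter technique.
std::shared_ptr<RefOp> makeRefOp(const std::shared_ptr<Curve>& curve) {
    // Heap-allocated copy of the shared_ptr; this is what curve_ will alias.
    auto heapCopy = std::make_shared<std::shared_ptr<Curve>>(curve);
    // The deleter's capture keeps heapCopy alive until the RefOp is deleted.
    return std::shared_ptr<RefOp>(
        new RefOp(*heapCopy),
        [heapCopy](RefOp* p) { delete p; });
}
```

With one external owner, the curve's use_count is 2 while the returned operator is alive (the caller's shared_ptr plus the heap copy) and drops back to 1 once the operator is released. Calling rate() on the operator in between reads through a reference that is still valid.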
Alternatives Considered
py::keep_alive<> is pybind11’s standard tool for preventing premature destruction. py::keep_alive<1, 3>() would tell pybind11 to keep the object at index 3 (rTS, since index 1 is the implicit self and index 2 is mesher) from being garbage-collected while self exists. But keep_alive operates on the Python side. It keeps the Python object alive, which keeps the underlying yield term structure alive. It does not keep alive the specific temporary shared_ptr that pybind11 created for the C++ constructor call. The reference member still dangles because it binds to that temporary, not to the Python object’s internal storage.
Modifying QuantLib to change the member from const shared_ptr<T>& to shared_ptr<T> (dropping the reference) would fix the problem at its source. This is the correct long-term fix. FdmCEVOp is the only public FDM operator in QuantLib 1.40 with this pattern. It is likely an oversight rather than a deliberate optimization – the shared_ptr copy costs a single atomic increment, and setTime() is called per time step where far heavier computation dominates.
Keeping a separate shared_ptr in the binding code and passing it via std::ref was considered but rejected. rTS_ needs a reference to a shared_ptr that outlives the FdmCEVOp, and std::ref only wraps a reference without extending the lifetime of what it refers to. Of the options considered, the heap-allocation approach is the one that guarantees this without modifying the constructor’s calling convention.
Identifying Affected Classes
Grep QuantLib headers for reference-to-shared_ptr members:
shared_ptr<.*>&\s+\w+_;
In QuantLib 1.40, three hits appear:
| Location | Member | Risk |
|---|---|---|
| FdmCEVOp | rTS_ | Exposed in public constructor; needs the workaround |
|  |  | Private inner class; |
|  |  | References a member of |
Only FdmCEVOp requires the workaround. But the pattern could appear in future QuantLib releases, so any new binding should check the header’s private section before writing a direct py::init<>().
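The same check can be scripted. The sketch below applies the regular expression from this section line by line using std::regex. The function name findReferenceMembers is hypothetical, and the header text is assumed to be already loaded into a string.

```cpp
#include <cassert>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns every line of headerText that declares a
// reference-to-shared_ptr member whose name ends in an underscore.
std::vector<std::string> findReferenceMembers(const std::string& headerText) {
    // Same pattern as the grep expression in this section.
    static const std::regex pattern(R"(shared_ptr<.*>&\s+\w+_;)");
    std::vector<std::string> hits;
    std::istringstream lines(headerText);
    std::string line;
    while (std::getline(lines, line)) {
        if (std::regex_search(line, pattern)) {
            hits.push_back(line);
        }
    }
    return hits;
}
```

The pattern relies on the trailing-underscore naming convention for members and on the & being attached to the type, so declarations written in another style would need a looser expression. Constructor parameters do not match, because they end in a comma or parenthesis rather than an underscore followed by a semicolon.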
The Broader Lesson
C++ binding layers create an impedance mismatch around object lifetimes. C++ constructors assume their callers manage argument lifetimes. pybind11 creates temporaries that live only for the duration of the call. Most of the time this is fine – constructors copy or move their arguments into members. But a constructor that stores a reference to its argument creates a hidden contract: “keep this alive for me.” pybind11 cannot honor contracts it does not know about.
The fix is ugly. Two levels of shared_ptr, a custom deleter, a lambda capture. But it is contained to a single binding function, invisible to the Python user, and correct. The alternative – a clean binding that crashes in production – is worse.
See Also
fdmcevop.cpp for the complete implementation
QuantLib FdmCEVOp header for the reference member declaration