MLIRSharp
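Notes summarizing the development of MLIRSharp.
Python bindings
The Python bindings are a good reference example. Using tblgen,
MLIR can generate the C API and Python files for a dialect's operations (provided the operations are defined through ODS),
as well as the C API for passes.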
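1. Context management
The Python bindings wrap MlirContext in a fairly elaborate way, and I need to understand it in detail before I can decide how to implement the same thing in C#.
Here is the class definition of PyMlirContext:
class PyMlirContext {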
public:
  PyMlirContext() = delete;
  PyMlirContext(const PyMlirContext &) = delete;
  PyMlirContext(PyMlirContext &&) = delete;
  /// For the case of a python __init__ (py::init) method, pybind11 is quite
  /// strict about needing to return a pointer that is not yet associated to
  /// an py::object. Since the forContext() method acts like a pool, possibly
  /// returning a recycled context, it does not satisfy this need. The usual
  /// way in python to accomplish such a thing is to override __new__, but
  /// that is also not supported by pybind11. Instead, we use this entry
  /// point which always constructs a fresh context (which cannot alias an
  /// existing one because it is fresh).
  static PyMlirContext *createNewContextForInit();
  /// Returns a context reference for the singleton PyMlirContext wrapper for
  /// the given context.
  static PyMlirContextRef forContext(MlirContext context);
  ~PyMlirContext();
  /// Accesses the underlying MlirContext.
  MlirContext get() { return context; }
  /// Gets a strong reference to this context, which will ensure it is kept
  /// alive for the life of the reference.
  PyMlirContextRef getRef() {
    return PyMlirContextRef(this, pybind11::cast(this));
  }
  /// Gets a capsule wrapping the void* within the MlirContext.
  pybind11::object getCapsule();
  /// Creates a PyMlirContext from the MlirContext wrapped by a capsule.
  /// Note that PyMlirContext instances are uniqued, so the returned object
  /// may be a pre-existing object. Ownership of the underlying MlirContext
  /// is taken by calling this function.
  static pybind11::object createFromCapsule(pybind11::object capsule);
  /// Gets the count of live context objects. Used for testing.
  static size_t getLiveCount();
  /// Gets the count of live operations associated with this context.
  /// Used for testing.
  size_t getLiveOperationCount();
  /// Clears the live operations map, returning the number of entries which were
  /// invalidated. To be used as a safety mechanism so that API end-users can't
  /// corrupt by holding references they shouldn't have accessed in the first
  /// place.
  size_t clearLiveOperations();
  /// Gets the count of live modules associated with this context.
  /// Used for testing.
  size_t getLiveModuleCount();
  /// Enter and exit the context manager.
  pybind11::object contextEnter();
  void contextExit(const pybind11::object &excType,
                   const pybind11::object &excVal,
                   const pybind11::object &excTb);
  /// Attaches a Python callback as a diagnostic handler, returning a
  /// registration object (internally a PyDiagnosticHandler).
  pybind11::object attachDiagnosticHandler(pybind11::object callback);
  /// Controls whether error diagnostics should be propagated to diagnostic
  /// handlers, instead of being captured by `ErrorCapture`.
  void setEmitErrorDiagnostics(bool value) { emitErrorDiagnostics = value; }
  struct ErrorCapture;
private:
  PyMlirContext(MlirContext context);
  // Interns the mapping of live MlirContext::ptr to PyMlirContext instances,
  // preserving the relationship that an MlirContext maps to a single
  // PyMlirContext wrapper. This could be replaced in the future with an
  // extension mechanism on the MlirContext for stashing user pointers.
  // Note that this holds a handle, which does not imply ownership.
  // Mappings will be removed when the context is destructed.
  using LiveContextMap = llvm::DenseMap<void *, PyMlirContext *>;
  static LiveContextMap &getLiveContexts();
  // Interns all live modules associated with this context. Modules tracked
  // in this map are valid. When a module is invalidated, it is removed
  // from this map, and while it still exists as an instance, any
  // attempt to access it will raise an error.
  using LiveModuleMap =
      llvm::DenseMap<const void *, std::pair<pybind11::handle, PyModule *>>;
  LiveModuleMap liveModules;
  // Interns all live operations associated with this context. Operations
  // tracked in this map are valid. When an operation is invalidated, it is
  // removed from this map, and while it still exists as an instance, any
  // attempt to access it will raise an error.
  using LiveOperationMap =
      llvm::DenseMap<void *, std::pair<pybind11::handle, PyOperation *>>;
  LiveOperationMap liveOperations;
  bool emitErrorDiagnostics = false;
  MlirContext context;
  friend class PyModule;
  friend class PyOperation;
};
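The following walks through a few scenarios as they actually occur in Python.
1.1 Constructing a context
Users get two ways to manage a context: construct it directly and keep the object around,
or manage it with a `with` statement.
Either way, as soon as a context is constructed, the underlying allocated context pointer is stored in the LiveContextMap.
When the `with` syntax is used,
`__enter__` is called, which makes that context the current one.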
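from mlir.ir import Context

ctx1 = Context()  # constructed directly; never entered, so it is not "current"
try:
    curr = Context.current
except ValueError:
    print("no current")
    print("live:", Context._get_live_count())
with Context() as ctx2:  # __enter__ makes ctx2 the current context
    print(Context.current)
    print("live:", Context._get_live_count())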
TEST: testInsertionPointEnterExit
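To make the two mechanisms above concrete, here is a minimal pure-Python sketch of a live-context map combined with a "current context" stack. It does not use MLIR at all: the `Context` class, the integer `handle` (standing in for the native context pointer), and the method names `for_handle`, `live_count`, and `current` are all hypothetical, chosen only to mirror the roles of `forContext()`, `getLiveCount()`, and `Context.current`.

```python
import threading
import weakref


class Context:
    """Toy stand-in for a context wrapper; `handle` plays the native pointer."""

    # handle -> wrapper. Weak values: the map itself keeps no wrapper alive.
    _live = weakref.WeakValueDictionary()
    _next_handle = 1
    _tls = threading.local()

    def __init__(self):
        # Every construction "allocates" a fresh handle and records it.
        self.handle = Context._next_handle
        Context._next_handle += 1
        Context._live[self.handle] = self

    @classmethod
    def for_handle(cls, handle):
        """Look up the unique live wrapper for a handle (KeyError if none)."""
        return cls._live[handle]

    @classmethod
    def live_count(cls):
        return len(cls._live)

    @classmethod
    def _stack(cls):
        # One stack per thread, created lazily.
        if not hasattr(cls._tls, "stack"):
            cls._tls.stack = []
        return cls._tls.stack

    @classmethod
    def current(cls):
        """Return the innermost entered context, or None if there is none."""
        stack = cls._stack()
        return stack[-1] if stack else None

    def __enter__(self):
        Context._stack().append(self)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        Context._stack().pop()
        return False  # never swallow exceptions
```

A context that is only constructed shows up in the live map but is never "current"; entering it with `with` pushes it onto the per-thread stack until the block exits. In C# the same split would presumably map to a static dictionary plus an `IDisposable` scope object, but that is a design guess rather than something the Python bindings prescribe.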
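1.2 Location / insertion point
ctx1 = Context()
Location and insertion point go through the same default push path,
which maintains a stack of (context, insertionPoint, location) entries. If a new scope is entered
and the context is unchanged, any insertion point or location missing from the new entry is taken from the previous entry:
void PyThreadContextEntry::push(FrameKind frameKind, py::object context,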
                                py::object insertionPoint,
                                py::object location) {
  auto &stack = getStack();
  stack.emplace_back(frameKind, std::move(context), std::move(insertionPoint),
                     std::move(location));
  // If the new stack has more than one entry and the context of the new top
  // entry matches the previous, copy the insertionPoint and location from the
  // previous entry if missing from the new top entry.
  if (stack.size() > 1) {
    auto &prev = *(stack.rbegin() + 1);
    auto &current = stack.back();
    if (current.context.is(prev.context)) {
      // Default non-context objects from the previous entry.
      if (!current.insertionPoint)
        current.insertionPoint = prev.insertionPoint;
      if (!current.location)
        current.location = prev.location;
    }
  }
}
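The inheritance rule in `push` is easy to model outside of pybind11. The following is a rough Python sketch of the same logic, not the binding's actual code: `Frame` and `ThreadContextStack` are hypothetical names, the `frameKind` argument is left out for brevity, and `None` plays the role of an empty `py::object`.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class Frame:
    context: Any
    insertion_point: Optional[Any] = None
    location: Optional[Any] = None


class ThreadContextStack:
    def __init__(self):
        self.frames: List[Frame] = []

    def push(self, context, insertion_point=None, location=None):
        self.frames.append(Frame(context, insertion_point, location))
        if len(self.frames) > 1:
            prev, current = self.frames[-2], self.frames[-1]
            # Only inherit when the new frame stays in the same context.
            if current.context is prev.context:
                if current.insertion_point is None:
                    current.insertion_point = prev.insertion_point
                if current.location is None:
                    current.location = prev.location
        return self.frames[-1]

    def pop(self):
        return self.frames.pop()

    @property
    def top(self):
        return self.frames[-1] if self.frames else None
```

Pushing a frame that only sets a location keeps the insertion point of the enclosing frame, as long as both frames share the same context object; switching to a different context starts with both fields empty.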
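1.3 Block/operation management
This is probably the most complicated part. The core problem is that, inside MLIR, an operation already records its nested blocks and related information, but a binding in another language has no knowledge of what has been recorded internally. For example, suppose I implement operation/block on top of SafeHandle. An operation can then return its inner block, but if that block is obtained by calling the extern function directly, C# sees it as a brand-new block object. In other words, if I construct an operation from a C# block and later fetch the block back from the operation, the block I get is not the original block object.
I can think of two possible solutions:
- Maintain an extra set of reference relationships inside the C# classes, so that an already constructed C# object is returned whenever one exists.
- When constructing a C# object, distinguish between "newly created" and "created from a reference", and override comparison on the C# objects so that two objects are equal whenever their handles are equal. However, this creates a new C# object on every lookup, and I am not sure whether that hides other problems.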
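The two options listed above can be compared in a few lines. This is only an illustrative Python sketch, since the notes do not settle on a design: `Block`, `BlockCache`, and `wrap` are hypothetical names, and a plain integer stands in for the native block pointer.

```python
import weakref


class Block:
    """Wrapper around a native block handle (an int stands in for the pointer)."""

    def __init__(self, handle):
        self.handle = handle

    # Option 2: wrappers compare equal whenever their handles match.
    def __eq__(self, other):
        return isinstance(other, Block) and self.handle == other.handle

    def __hash__(self):
        return hash(self.handle)


class BlockCache:
    """Option 1: hand back the wrapper that already exists for a handle."""

    def __init__(self):
        # Weak values, so the cache does not extend any wrapper's lifetime.
        self._live = weakref.WeakValueDictionary()

    def wrap(self, handle):
        block = self._live.get(handle)
        if block is None:
            block = Block(handle)
            self._live[handle] = block
        return block
```

With the cache, fetching the same handle twice yields the identical object, so reference identity is preserved at the cost of bookkeeping. With handle-based equality alone, each lookup creates a distinct wrapper that still compares and hashes the same; any per-wrapper state (for example an "invalidated" flag) would then not be shared between the copies, which is one of the hidden problems to watch for.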
 
First, let's look at how the Python binding handles these data relationships:
1.4 Python wrapper
On the Python side the context is wrapped in one more layer, so that every context loads all dialects as soon as it is constructed.
def _site_initialize():
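2. Constructing attributes
The official pybind11 bindings use CRTP to build a base class that handles attribute registration; each attribute class inherits from that CRTP base and is then registered. The C# binding has no intermediate C++ wrapper, so this has to be implemented directly in C#.
Every attribute appears to have an mlirAttributeIsAXXX check and an mlirXXXAttrGet constructor function,
and some also provide an mlirXXXAttrGetTypeID function. Looking at the Python
binding implementation, it essentially records a mapping from type ID to activator in Globals,
and some Python-side mechanism then calls the downcast automatically to perform the conversion.
What puzzles me is that attributes have no destroy function, so it is unclear when the memory used by array-like attributes is released. Attributes in MLIR are uniqued and owned by the context, so their storage is presumably released only when the context itself is destroyed.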
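The type-ID-to-activator mapping described above can be sketched as a small registry. This is a simplified model under stated assumptions, not the binding's real code: `Attribute`, `register_attribute`, `downcast`, and the integer type IDs are all made up for illustration, whereas a real binding would obtain the IDs from the mlirXXXAttrGetTypeID functions.

```python
from typing import Callable, Dict


class Attribute:
    """Generic wrapper; `type_id` stands in for the attribute's native type ID."""

    def __init__(self, handle, type_id):
        self.handle = handle
        self.type_id = type_id


# type ID -> factory ("activator") that builds the concrete wrapper.
_ACTIVATORS: Dict[int, Callable[[Attribute], Attribute]] = {}


def register_attribute(type_id):
    """Class decorator that records a concrete attribute class under its type ID."""

    def decorator(cls):
        cls.TYPE_ID = type_id
        _ACTIVATORS[type_id] = lambda attr: cls(attr.handle, attr.type_id)
        return cls

    return decorator


def downcast(attr):
    """Return the most specific registered wrapper, or `attr` unchanged."""
    activator = _ACTIVATORS.get(attr.type_id)
    return activator(attr) if activator else attr


@register_attribute(type_id=1)  # hypothetical ID
class BoolAttr(Attribute):
    pass


@register_attribute(type_id=2)  # hypothetical ID
class IntegerAttr(Attribute):
    pass
```

A generic attribute coming back from a native call is passed through `downcast`, which picks the concrete class by type ID and falls back to the generic wrapper for unregistered IDs. In C# the registry could plausibly be a static `Dictionary` filled by each concrete class, with the mlirAttributeIsAXXX checks serving as a fallback for attributes that expose no type ID.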
/// Checks whether the given attribute is a bool attribute.
# TODO: Auto-generated. Audit and fix.