We introduce LoongX, which effectively integrates multimodal neural signals to guide image editing through a novel Cross-Scale State Space (CS3) encoder and Dynamic Gated Fusion (DGF) modules.
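
The text above names the DGF modules but does not describe their internals. As a generic illustration of what gated fusion of two feature streams can look like, here is a minimal sketch. It is not the LoongX implementation: the function `gated_fusion`, its parameters `w_a`, `w_b`, and `bias`, and the element-wise sigmoid gate are all assumptions made for this example.

```python
import math


def sigmoid(x):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(a, b, w_a, w_b, bias):
    """Fuse two equal-length feature vectors with an element-wise gate.

    Hypothetical illustration only: the gate form and the parameter
    names are assumptions, not taken from the source.
    """
    if not (len(a) == len(b) == len(w_a) == len(w_b) == len(bias)):
        raise ValueError("all inputs must have the same length")
    fused = []
    for x, y, wa, wb, c in zip(a, b, w_a, w_b, bias):
        # The gate depends on both inputs, so the mix changes per element.
        g = sigmoid(wa * x + wb * y + c)
        # Convex combination: g close to 1 favors a, g close to 0 favors b.
        fused.append(g * x + (1.0 - g) * y)
    return fused


# With zero weights and zero bias the gate is 0.5, so the result is the mean.
print(gated_fusion([1.0, 2.0], [3.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0]))
```

Because the gate is computed from the inputs themselves, the weighting between the two streams varies per element and per sample rather than being fixed, which is the general sense in which such a fusion is "dynamic".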