
Not impressive compared to the open-source video models out there. I anticipated some physics/VR capabilities, but it's basically just a marketing promotion to "stay in the game"...


I... can you explain, or point to some competitors...? To me this looks leagues ahead of everything else. But maybe I'm behind the game?

AFAIK based on HuggingFace trending[1], the competitors are:

- bytedance/animatediff-lightning: https://arxiv.org/pdf/2403.12706 (2.7M downloads in the past 30d, released in March)

- genmo/mochi-1-preview: https://github-production-user-asset-6210df.s3.amazonaws.com... (21k downloads, released in October)

- thudm/cogvideox-5b: https://huggingface.co/THUDM/CogVideoX-5b (128k downloads, released in August)

Is there a better place to go? I'm very much not plugged into this part of LLMs, partially because it's just so damn spooky...

EDIT: I now see the reply above referencing Hunyuan, which I didn't even know was its own model. Fair enough! I guess, like always, we'll just need to wait for release so people can run their own human-preference tests to definitively say which is better. Hunyuan does indeed seem good.


What's the best open source video model right now?


Hunyuan (https://replicate.com/tencent/hunyuan-video , $0.70/video) is the best but somewhat expensive. LTX (https://replicate.com/fofr/ltx-video , $0.10) is cheaper/faster but less capable.

Both are permissively licensed.


Hunyuan at other providers like fal.ai is cheaper than Sora for the same resolution: for 720p, 5-second videos, $20 gets you ~15 videos on Sora vs almost 50 at fal. It is slower than Sora (~3 minutes for a 720p video) but faster than Replicate's Hunyuan (by 6-7x for the same settings).

https://fal.ai/models/fal-ai/hunyuan-video
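
For anyone who wants the per-video numbers, here is a rough sketch of the arithmetic in Python. It only uses the approximate figures quoted in this thread (about 15 videos for $20 on Sora, almost 50 at fal, $0.70 per video on Replicate), and `cost_per_video` is a hypothetical helper name, not anything from a provider's API. Treat the output as a ballpark, not official pricing.

```python
# Back-of-the-envelope per-video cost from the figures quoted in this thread.
# All inputs are commenters' approximate numbers, not official pricing.

def cost_per_video(budget_usd: float, videos: float) -> float:
    """Average cost of one video for a given budget (hypothetical helper)."""
    if videos <= 0:
        raise ValueError("videos must be positive")
    return budget_usd / videos

BUDGET_USD = 20.0
sora_cost = cost_per_video(BUDGET_USD, 15)  # "~15 videos for $20"
fal_cost = cost_per_video(BUDGET_USD, 50)   # "almost 50 videos at fal"
replicate_cost = 0.70                       # per-video price quoted earlier in the thread

for name, cost in [
    ("Sora", sora_cost),
    ("fal.ai Hunyuan", fal_cost),
    ("Replicate Hunyuan", replicate_cost),
]:
    print(f"{name}: ~${cost:.2f} per video")
```

On those rough inputs, Sora works out to about $1.33 per video and fal to about $0.40, with Replicate's quoted $0.70 in between.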


Hunyuan is a recent one that has looked pretty good.


Like with music generation models, the main thing that might make "open source" models better is most likely that they have no concern about excluding copyrighted material from the training data, so they actually get a good starting point instead of using a dataset consisting of YouTube videos and stock footage.



