 Full DeepSeek R1 At Home 🥳🥳🥳

hellothere131495
post Jan 28 2025, 02:26 PM

QUOTE(kingkingyyk @ Jan 28 2025, 02:19 PM)
Note that of the open-source models, you can only run the distilled versions, not the full one they host in the cloud. You need a crazy number of fast GPUs to make the full version workable. Have fun heating your room.  biggrin.gif
*
Bruh. The full model is also open source:
https://github.com/deepseek-ai/DeepSeek-R1

ollama too:
https://ollama.com/library/deepseek-r1
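
Something like this should work from Python with the ollama client, if you want to poke at it. This is just a rough sketch with my own assumptions: pip install ollama, the ollama daemon running locally, and a tag from the library page above already pulled. deepseek-r1:8b below is only an example; the 671b tag is the full model and is enormous.

CODE
# Minimal sketch: chat with a locally pulled DeepSeek R1 tag through the ollama Python client.
# Assumes the ollama daemon is running and the tag was pulled first, e.g. "ollama pull deepseek-r1:8b".
import ollama

reply = ollama.chat(
    model="deepseek-r1:8b",  # swap in another tag from the library page if you want
    messages=[{"role": "user", "content": "Explain model quantization in one paragraph."}],
)
print(reply["message"]["content"])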

Edit:
lol you edited

This post has been edited by hellothere131495: Jan 28 2025, 02:27 PM
hellothere131495
post Jan 28 2025, 02:30 PM

QUOTE(azarimy @ Jan 28 2025, 02:23 PM)
What does it do? What's the difference compared to running the one online?
*
Same model. The difference is that you realistically won't get the chance to run the online one yourself: it's the full-precision, 671B-parameter model. What you can run locally is a distilled (and quantized) version, like the Qwen or Llama 3.1 8B ones that have been distilled to respond like the original DeepSeek R1.
hellothere131495
post Jan 28 2025, 02:32 PM

QUOTE(jmas @ Jan 28 2025, 02:30 PM)
Running the 1.5B Q4_K_M locally on my NAS.
Slow af, but it works, so I don't have to deal with the o1 free-tier limit.
*
1.5B at 4-bit is slow? What specs are you running it on? BTW, you'll probably still need o1; the 1.5B 4-bit one is a shit model that hallucinates a lot.
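
If it feels slow, worth actually measuring it. A minimal sketch, assuming the ollama Python client and that the 1.5b tag is already pulled; eval_count and eval_duration (in nanoseconds) are the generation stats the ollama API reports back:

CODE
# Rough tokens-per-second check for a local model via the ollama Python client.
# Assumes the ollama daemon is running and the deepseek-r1:1.5b tag has been pulled already.
import ollama

resp = ollama.generate(model="deepseek-r1:1.5b", prompt="Write a haiku about GPUs.")
tokens = resp["eval_count"]            # output tokens generated
seconds = resp["eval_duration"] / 1e9  # eval_duration is reported in nanoseconds
print(f"{tokens} tokens in {seconds:.1f}s -> {tokens / seconds:.1f} tok/s")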
hellothere131495
post Jan 28 2025, 02:33 PM

QUOTE(kurtkob78 @ Jan 28 2025, 02:31 PM)
Is 16GB of VRAM sufficient to run o1-level intelligence?
*
Ah, lol. 16GB of VRAM is a toy for this stuff; it can only run small 4-bit quantized models.

An o1-class model like the full R1 is huge. You'd probably need around 400GB of VRAM to run it even at 4-bit. Unquantized at full precision, I haven't worked out exactly how much, but it's far beyond that.
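
A quick back-of-the-envelope in Python, counting only the bytes needed to hold the weights (real usage is higher once you add KV cache and runtime overhead; the 671B figure is the published parameter count, everything else is plain arithmetic):

CODE
# Back-of-the-envelope: memory needed just to hold the weights of a 671B-parameter model.
# Ignores KV cache, activations and runtime overhead, so real requirements are higher.
PARAMS = 671e9  # DeepSeek R1 total parameter count

bytes_per_param = {
    "4-bit": 0.5,
    "8-bit": 1.0,
    "16-bit": 2.0,
    "32-bit": 4.0,
}

for name, b in bytes_per_param.items():
    print(f"{name:>6}: ~{PARAMS * b / 1024**3:,.0f} GB of weights")

# 4-bit  -> ~312 GB of weights (so roughly 400 GB of VRAM once overhead is counted)
# 32-bit -> ~2,500 GB, i.e. about 2.5 TB

# For comparison, an 8B distilled model at 4-bit:
print(f"8B @ 4-bit: ~{8e9 * 0.5 / 1024**3:.1f} GB, which is why it fits on a 16GB card")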
hellothere131495
post Jan 28 2025, 02:35 PM

QUOTE(JonSpark @ Jan 28 2025, 02:31 PM)
ok.....so world peace achieved?
*
No, if anything a war has just started. Seen the news? China's DeepSeek is getting attacked, probably by the US.
hellothere131495
post Jan 28 2025, 07:51 PM

QUOTE(Penamer @ Jan 28 2025, 05:27 PM)
Just wait for the China version of AI clustering. It'll surely be cheap as cabbage.
*
I genuinely hope for it. Who the heck can afford Nvidia H100s, let alone thousands of them?  ranting.gif
hellothere131495
post Jan 28 2025, 07:55 PM

QUOTE(ycs @ Jan 28 2025, 07:06 PM)
I got this from the 8b:

I am sorry, I cannot answer that question. I am an AI assistant designed to provide helpful and harmless responses.
*
You were probably asking something it considers unethical. If you want it to answer anyway, you'd have to fine-tune the model to respond to that kind of thing, or simply download a jailbroken .gguf version from somewhere else.
hellothere131495
post Jan 28 2025, 09:05 PM

QUOTE(KitZhai @ Jan 28 2025, 08:57 PM)
So you guys mean the different 8B, 16B, 32B or 64B versions all give different responses??
*
Of course. The smaller the model, the more prone it is to making mistakes, hallucinating, and not following instructions properly.
hellothere131495
post Jan 28 2025, 10:50 PM

QUOTE(KitZhai @ Jan 28 2025, 09:42 PM)
Then to compare with ChatGPT, which model would be the appropriate one to pit against it?
*
The 671B, full-precision, unquantized model.
hellothere131495
post Jan 29 2025, 04:59 PM

QUOTE(KitZhai @ Jan 29 2025, 01:11 AM)
It's free, right? Then what's the point of using the smaller versions compared to the 671B?
*
Yes, it's free and open source. The point of the smaller versions is that the 671B needs a seriously powerful machine to run: think a football-field-sized data centre full of Nvidia H100 clusters to power the model. It's practically impossible for a normal person; even if you buy the best consumer PC, you still can't run it.

So for us, the realistic options are the smaller models like 8B, 14B, or 32B.
hellothere131495
post Jan 29 2025, 05:03 PM

QUOTE(KitZhai @ Jan 29 2025, 09:40 AM)
But at full precision, how does the performance compare to ChatGPT? The same, or even better?
*
Comparable to ChatGPT's o1, arguably even a bit better.
hellothere131495
post Jan 29 2025, 05:11 PM

QUOTE(KitZhai @ Jan 29 2025, 02:10 PM)
Damn, my GPU is old already... RTX 3060 Super, I think... What can it do? Or should I just subscribe to ChatGPT?
*
If you're just an AI user, then use the ChatGPT or DeepSeek website. The downloadable 4-bit smaller models aren't going to perform better than those sites, which run the full-precision model with the giant parameter count.

Those 4-bit models are meant for things like production use, e.g. when you want to build a chatbot and sell it to a company. You won't ask the company to subscribe to ChatGPT, right? You need to deliver a fully standalone working chatbot.

Or you're working with sensitive data, so you need to download open-source models and run them privately.

Or you're an AI specialist doing research, and you download the models to study them.

Or you're simply a hobbyist or enthusiast in natural language processing who just wants to have fun with state-of-the-art AI.

 
