Ollama - Offline Generative AI


Ollama - Offline Generative AI, Similar to ChatGPT

TS xxboxx  Jun 14 2025, 12:41 AM | Post #41
The mind is for having ideas, not holding them Senior Member 5,261 posts Joined: Oct 2004 From: J@Y B33

QUOTE(c2tony @ Jun 13 2025, 10:36 PM)
8700G = 16 TOPS
Ryzen AI Max+ 395 = 55 TOPS
RTX3060 12GB = 100 TOPS
Apple Mac Studio M4 Max = 38 TOPS
They all can run.
BTW, 55 TOPS may sound like more AI power than 38 TOPS, but the way Apple handles data and optimizes usage can deliver equivalent or faster AI execution.
Even if your PC has 128GB of RAM, your GPU might be capped by its 24GB VRAM when loading a large AI model.
With Apple’s unified memory, you might comfortably run a llama4:16x17b entirely in GPU addressable space if you have 96GB of RAM.

The bigger the model, the more capable the GPU/NPU/CPU needs to be, and memory bandwidth matters as well. The RTX PRO 6000 video shows that when he runs Qwen2.5 Coder 32B FP16, which is 61GB in size, even the M4 Max with 500GB/second of memory bandwidth only gets 7.63 tokens/second, while the RTX PRO 6000 still gets a good 23 tokens/second. On the Ryzen AI Max+ 395 he uses Qwen2.5 Coder 32B q4_k_m, which is only 20GB, but it only gets 10.8 tokens/second. The 395 is a very capable CPU, but it's limited by its memory bandwidth.
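A rough way to see the bandwidth limit in those numbers: for a dense model, generating one token means reading roughly all the weights from memory once, so tokens/second cannot go much above bandwidth divided by model size. The sketch below is only a rule of thumb (it ignores KV cache, compute limits and software overhead). The M4 Max figures are from the post; the 256GB/second figure for the Ryzen AI Max+ 395 is an assumption taken from its published spec, not from this thread.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rule-of-thumb ceiling: each generated token streams all weights once."""
    return bandwidth_gb_s / model_size_gb


# From the post: M4 Max, 500 GB/s, 61 GB FP16 model, 7.63 tokens/s measured.
m4_ceiling = max_tokens_per_second(500, 61)

# ASSUMPTION: 256 GB/s for the Ryzen AI Max+ 395 (not stated in the thread).
# From the post: 20 GB q4_k_m model, 10.8 tokens/s measured.
ryzen_ceiling = max_tokens_per_second(256, 20)

for name, ceiling, measured in [
    ("M4 Max", m4_ceiling, 7.63),
    ("Ryzen AI Max+ 395", ryzen_ceiling, 10.8),
]:
    share = measured / ceiling
    print(f"{name}: ceiling {ceiling:.1f} tok/s, measured {measured} tok/s ({share:.0%} of ceiling)")
```

Under these assumptions both machines land close to their ceiling, which fits the idea that memory bandwidth rather than raw compute is what holds them back.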

TS xxboxx  Jun 20 2025, 11:28 PM | Post #42

Gemini can now generate images with accurate long text! I tried it and there's not even one wrong letter. But the image is not as highly detailed as the one generated by Flux, and the text is simple without much detail.

Flux generates a lot more detail than Gemini, but it still has issues producing accurate text and can't generate all of the text if it's too long. I had to generate tens of images to get this one image that looks good and has the first part of the text accurate.

Just for fun I tried creating images with an added prompt for Studio Ghibli artistic style. First is Gemini and second is Flux.

TS xxboxx  Jul 16 2025, 11:19 PM | Post #43

QUOTE(c2tony @ Jul 16 2025, 09:38 PM)
still like flux, but gemini is sufficient.
it's just like still like dslr but camera phone are sufficient

Yup, if you need something fast and presentable, Gemini is good enough, just like a camera phone. I watched this video of a data center with clusters of GPUs; it is hundreds or thousands of times more powerful than any PC's GPU. But the images created by Gemini are not up to the level of Flux; actually Google puts a hard limit on Gemini's ability so that they can save processing power.

[embedded video: zcwqTkbaZ0o]

TS xxboxx  Aug 2 2025, 01:05 PM | Post #44

QUOTE(c2tony @ Aug 1 2025, 10:14 PM)
now you can run ollama without docker!!!

I don't understand. I've been running Ollama without Docker from the start. Or do you mean WebUI? I didn't see any instructions on how to do this.

TS xxboxx  Aug 2 2025, 11:53 PM | Post #45

QUOTE(c2tony @ Aug 2 2025, 09:40 PM)
right click the ollama icon near the clock, select "open ollama"
or you're using linux afterall, then i have no clue

Just updated Ollama and I understand what you mean now. No need to use WebUI anymore. The settings are so simple that it can't even show its tokens per second. But it looks faster than WebUI and more accurate too.
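Even though the app doesn't show tokens per second, the number can be worked out from Ollama's local REST API, whose non-streaming reply includes `eval_count` (tokens generated) and `eval_duration` (in nanoseconds). A minimal sketch, assuming Ollama is listening on its default port 11434; the sample values at the bottom are made up for illustration, not measured.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint


def generate(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming request to a running Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())


def tokens_per_second(response: dict) -> float:
    """eval_duration is reported in nanoseconds."""
    return response["eval_count"] / (response["eval_duration"] / 1e9)


# Made-up sample reply, only to show the arithmetic without a server running.
sample = {"eval_count": 100, "eval_duration": 4_000_000_000}
print(f"{tokens_per_second(sample):.1f} tokens/s")
```

With a server running, `tokens_per_second(generate("gemma3:12b", "Hello"))` would give the real figure. On the command line, `ollama run <model> --verbose` should print similar timing statistics after each reply.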

TS xxboxx  Aug 5 2025, 05:53 PM | Post #46

Qwen Image sounds promising. It can do long text. I haven't seen yet whether it can be used with ComfyUI.

TS xxboxx  Aug 6 2025, 08:51 AM | Post #47

[embedded video: 0BHBoDABOfY]

Qwen3, the latest high-performance LLM. But the size is bigger than 32GB for 8-bit quantization and bigger than 16GB for 4-bit quantization. LM Studio supports splitting the model across 2 GPUs to get around the out-of-memory issue, but the tokens per second is a lot slower than on Apple's M chips.

Interestingly, there's a red bar that flashes for a second with the name blurred out. Could that be the AMD AI Max+ 395? When someone asked, he did say he couldn't share the AMD results until next week, so maybe they're still under embargo.
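Those sizes follow from simple arithmetic: weight size is roughly parameter count times bits per weight, divided by 8. The sketch below assumes a model of about 32 billion parameters, which is a guess that fits the numbers in the post (the exact Qwen3 variant isn't named here). Real files come out somewhat bigger than this estimate because some tensors are kept at higher precision, and the KV cache needs memory on top at runtime.

```python
def weight_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (decimal): parameters x bits / 8."""
    return params_billion * bits_per_weight / 8


PARAMS_BILLION = 32  # ASSUMPTION: roughly 32B parameters, not stated in the post

for label, bits in [("8-bit", 8), ("4-bit", 4)]:
    size = weight_size_gb(PARAMS_BILLION, bits)
    print(f"{label}: about {size:.0f} GB of weights before overhead")
```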

TS xxboxx  Sep 5 2025, 05:18 PM | Post #48

It's been a while since I last played with LLMs. I tried GPT-OSS:20B, which came with the Ollama app. It is a thinking model but was able to come to a conclusion in less than 2 seconds, much quicker than deepseek-r1, which thinks for a very long time on a simple calculation and took almost 37 seconds.

I asked this question on deepseek-r1, GPT-OSS:20B and gemma3:12b:

CODE
Bob has three boxes in front of him - Box A, Box B and Box C. Bob does not know what is in the boxes. Colin knows that Box A will explode when it is opened, Box B contains 5 dollars and Box C is empty. Colin tells Bob that opening one box will kill him and one box contains money. Should Bob open a box?

deepseek-r1 understood the story wrongly, saying that Colin revealed each box's contents, and that also led it to the wrong conclusion.

gemma3:12b doesn't show its thinking. It managed to understand the story correctly and also correctly decided what Bob should do.

GPT-OSS:20B gave a PhD-level answer after thinking for 3.1 seconds.
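The same comparison can be scripted instead of pasting the question into each model by hand. This is only a sketch: it assumes Ollama is serving on its default local port, that all three models have already been pulled under the tags below, and it uses the `total_duration` field (nanoseconds) from the API reply, which also includes model load time. `ask`, `summarize` and `compare` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # assumed default local endpoint
MODELS = ["deepseek-r1", "gpt-oss:20b", "gemma3:12b"]  # assumed local model tags
PROMPT = (
    "Bob has three boxes in front of him - Box A, Box B and Box C. "
    "Bob does not know what is in the boxes. Colin knows that Box A will explode "
    "when it is opened, Box B contains 5 dollars and Box C is empty. "
    "Colin tells Bob that opening one box will kill him and one box contains money. "
    "Should Bob open a box?"
)


def ask(model: str, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one non-streaming request to a running Ollama server."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read())


def summarize(model: str, response: dict) -> str:
    """One line per model: wall time reported by Ollama and answer length."""
    seconds = response["total_duration"] / 1e9  # nanoseconds to seconds
    return f"{model}: {seconds:.1f}s, {len(response['response'])} characters"


def compare(models: list[str] = MODELS, prompt: str = PROMPT) -> list[str]:
    """Run the same prompt on every model; needs the server and models available."""
    return [summarize(model, ask(model, prompt)) for model in models]
```

Calling `print("\n".join(compare()))` runs the question on each model in turn. Judging whether the answer is actually right still has to be done by reading it.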

TS xxboxx  Sep 5 2025, 05:35 PM | Post #49

QUOTE(xxboxx @ May 28 2025, 12:39 PM)
Ollama's vision now after update is a lot better than few months ago
Using gemma3:12b there is some wrong data but a lot better than previously

The latest Ollama has also improved on character recognition. All the characters are correctly presented. But the Ollama app still can't add pictures, so I have to use Open WebUI.

TS xxboxx  Sep 17 2025, 01:03 AM | Post #50

The photo shows up to 128GB RAM but there's no option to configure the machine 🤷

Meanwhile I'm having fun with Google's Nano Banana. It's as good as running a local generative model such as Flux.

I added this photo and asked it to create a realistic photo based on that image.

I asked it to change the background to a rooftop, with a slight digital effect on the side of the image.

I asked it to change it to daytime.

Make an explosion effect in the background similar to a ~~Hollywood~~ big budget movie (it blocked the request when I had Hollywood in the prompt, so I changed it to big budget).

Add a bold heading at the bottom, make the words appear to be made of neon light bulbs: lowyat.net malaysia's largest online community

Its ability to stay consistent across each new photo generation is amazing.

TS xxboxx  Sep 28 2025, 11:13 PM | Post #51

Qwen released their image_edit_2509 recently. It's supposed to be able to edit an image from a prompt and stay consistent. I tried the lowyat picture and asked it to remove the explosion. It did well.

Using Nano Banana:

Qwen is able to output something very close to Nano Banana, but it didn't remove the flare in the middle of the two guys' shirts as cleanly as Nano Banana.

I tried asking Qwen to create a realistic photo from the original design.

This post has been edited by xxboxx: Sep 28 2025, 11:14 PM

TS xxboxx  Nov 21 2025, 11:57 PM | Post #52

QUOTE(c2tony @ Nov 21 2025, 06:47 PM)
run using ollama? can't find it
P/S: Comet browser from perplexity sounds promising that can change the way of browsing work

This is using ComfyUI for image generation.

deepseek-ocr was released yesterday and it's supposed to be able to look at an image and run OCR, but when I ran it, it kept repeating "x=0" until I closed the app. I've already updated to version 0.13.0, so I don't know what the issue is. Maybe it still has some bugs.

TS xxboxx  Nov 22 2025, 11:03 PM | Post #53

QUOTE(c2tony @ Nov 22 2025, 07:14 AM)
Yeah, mine having error as well and i closed it go sleep lmao
somehow i think lm studio getting better and ollama getting worse with their new method, app like and cloud thingy

Yeah, at first it seemed better with its app, using fewer resources and running faster, but now it seems worse. It's also very annoying that it adds cloud options to the model choices.

TS xxboxx  Nov 28 2025, 09:13 AM | Post #54

QUOTE(c2tony @ Nov 27 2025, 10:32 PM)
And FLUX 2.0 already
https://huggingface.co/spaces/black-forest-labs/FLUX.2-dev

The VRAM requirement is getting even higher.


All Rights Reserved © 2002- 2025 Vijandren Ramadass (~unite against racism~)
