
Ollama - Offline Generative AI, Similar to ChatGPT

TSxxboxx
post Apr 26 2024, 10:10 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


I searched but there doesn't seem to be any thread about Ollama, and nobody seems to be discussing it either. I think this is the correct sub-forum for it.

Anyway, I've been dabbling with Ollama for almost a month, ever since they released the Windows installer version. At first I was using it in CMD / Terminal; mistral and llama2 seemed like the best models at the time. Then I learned about pairing Ollama with PrivateGPT so it could ingest data.

Now, a month later, Meta has released llama3 and Microsoft has released the phi3 model, and the intelligence level has got even better. I'm now using Open WebUI instead of PrivateGPT as it looks a lot more stable and makes it easy to change models.

I tried giving this puzzle to the AI; if you've seen Alice in Borderland you'll know the answer. Only one gave an impressively correct answer.
QUOTE
There are three on/off switches on the ground floor of a building. Only one operates a single lightbulb on the third floor. The other two switches are not connected to anything. Put the switches in any on/off order you like. Then go to the third floor to check the bulb. Without leaving the third floor, can you figure out which switch is genuine? You get only one try.


mistral:
QUOTE
To determine which switch controls the lightbulb on the third floor with just one try, we can apply a method called "process of elimination." Here's a strategy to follow:

1. Test the switches in a specific order and record the results. A common approach is to test Switch 1 first, then Switch 2, and finally Switch 3.
2. After testing the switches, observe the current state of the lightbulb on the third floor. Based on this information, eliminate the switches that did not correspond to the bulb's state when they were turned on.
3. Now, you are left with the switch(es) that could control the lightbulb. Since there is only one genuine switch, the remaining switch(es) must be the one you are looking for.

Here's an example of how this process might work:

- Test Switch 1: If the bulb turns on when Switch 1 is flipped, then it is the genuine switch. If not, proceed to the next step.
- Test Switch 2: If the bulb turns on when Switch 2 is flipped, then Switch 1 is a decoy and Switch 2 is the genuine switch. If not, proceed to the next step.
- Test Switch 3: If the bulb turns on when Switch 3 is flipped, then Switch 3 is the genuine switch.

So in summary, by testing the switches in a specific order and observing the lightbulb's state, you can figure out which switch is genuine with just one try.


llama3:
QUOTE
A classic puzzle!
Let's think about it step by step.

First, let's assume that all three switches are on. We'll then go to the third floor and turn off the lightbulb. Now, we're back on the ground floor. Since we don't know which switch controls the lightbulb, we can't just flip a switch and hope for the best. Instead, let's think about what would happen if we turned each switch to different positions.

If we turn two switches off and one on, the bulb will stay off (because it only responds to one of our three switches). If we turn all three switches off, the bulb will still be off. This tells us that at least one of the other two switches must control the lightbulb.

Now, let's consider what happens if we turn two switches on and one off. The bulb will stay off (because it only responds to one of our three switches). But here's the crucial point: when we flip just one switch back and forth between "on" and "off," the bulb will change state only once. This tells us that this one switch must control the lightbulb.

So, with a single try, we can figure out which switch is genuine by flipping two switches off and one on, then flipping just one of those three switches back and forth between "on" and "off,".


phi3:
QUOTE
To determine which switch controls the lightbulb without returning to the ground floor, perform these steps:

1. Turn on switch 1 and leave it on for a few minutes.
2. After waiting, turn off switch 1 and turn on switch 2 (you can even leave it turned on or just turn it after turning off switch 1).
3. Proceed to the third floor with lightbulb in sight.
4. Observe the state of the bulb:
  - If the bulb is lit, then switch 2 controls the lightbulb (because you turned it on and left it on when going upstairs).
  - If the bulb is off but warm to touch, then switch 1 was the last one that was on before being switched off. This indicates the first switch because if the second or third switch were in use, the bulb would still be lit after the power from those switches was cut (since they weren't left on for a while).
  - If the bulb is off and cool to touch, then either none of the switches are connected to the lightbulb or all three switches control it. However, since we know that only one does, this scenario would indicate an error in our assumptions (or there's no connection at all).

This method allows you to identify which switch controls the bulb with a single attempt and without returning to check other switches.


Anyone else playing around with Ollama?
iZuDeeN
post May 11 2024, 01:02 AM

Look at all my stars!!
*******
Senior Member
3,466 posts

Joined: Jan 2003
From: PJ, Malaysia
Guys, can someone give me a newbie guide to installing Ollama with the Llama3 model on a Windows machine?

I tried using Docker on Synology and it went perfectly, but it took forever to generate a response

On PC I would be using it on a 5600X with an RTX 3080 and 32GB RAM
TSxxboxx
post May 13 2024, 04:55 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(iZuDeeN @ May 11 2024, 01:02 AM)
Guys, can someone give me a newbie guide to installing Ollama with the Llama3 model on a Windows machine?

I tried using Docker on Synology and it went perfectly, but it took forever to generate a response

On PC I would be using it on a 5600X with an RTX 3080 and 32GB RAM
*
Even on my 8-core laptop it is slow to generate responses compared to my computer with a GPU. A NAS CPU is just not powerful enough to run AI.

Ollama can be installed easily on Windows with its setup file. Go to ollama.com and download the Windows setup file. After installing, open Command Prompt or PowerShell (you might need to run as admin) and type "ollama run <insert model name>".
It will then download the model. Better to get llama3:instruct, as this has been tweaked to give better answers. Now you can type anything to ask the AI. phi3:instruct is also another good model to try.
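Roughly, the whole flow is just these commands (a rough sketch; any model tag from the ollama.com library works the same way):
CODE
# downloads the model on first run, then opens a chat in the terminal
ollama run llama3:instruct

# phi3's instruct variant works the same way
ollama run phi3:instruct

# list the models you have downloaded
ollama list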

If you want to have a UI (user interface), install this
https://github.com/open-webui/open-webui

I recommend using the Docker method. You still need to install Ollama; this just creates the web interface and also handles RAG. After installing Docker, just run this in the command prompt:
CODE
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main


You can attach a text doc or insert a link and ask questions about it. With a model such as llava you can also attach a picture and ask about it, but this is still in its infancy.

Once done, you can access it at http://localhost:3000
With some configuration of your computer's firewall, you can also access the web UI from any device connected to the same local network; your computer becomes the AI server for your household.
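For the firewall part, one way on Windows is to allow inbound TCP on Open WebUI's port (a sketch, assuming the default port 3000 from the docker command above); other devices can then browse to http://<your-pc-ip>:3000
CODE
# run in an elevated Command Prompt / PowerShell
netsh advfirewall firewall add rule name="Open WebUI" dir=in action=allow protocol=TCP localport=3000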
iZuDeeN
post May 13 2024, 09:47 PM

Look at all my stars!!
*******
Senior Member
3,466 posts

Joined: Jan 2003
From: PJ, Malaysia
QUOTE(xxboxx @ May 13 2024, 04:55 PM)
Even on my 8-core laptop it is slow to generate responses compared to my computer with a GPU. A NAS CPU is just not powerful enough to run AI.

Ollama can be installed easily on Windows with its setup file. Go to ollama.com and download the Windows setup file. After installing, open Command Prompt or PowerShell (you might need to run as admin) and type "ollama run <insert model name>".
It will then download the model. Better to get llama3:instruct, as this has been tweaked to give better answers. Now you can type anything to ask the AI. phi3:instruct is also another good model to try.

If you want to have a UI (user interface), install this
https://github.com/open-webui/open-webui

I recommend using the Docker method. You still need to install Ollama; this just creates the web interface and also handles RAG. After installing Docker, just run this in the command prompt:
CODE
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main


You can attach a text doc or insert a link and ask questions about it. With a model such as llava you can also attach a picture and ask about it, but this is still in its infancy.

Once done, you can access it at http://localhost:3000
With some configuration of your computer's firewall, you can also access the web UI from any device connected to the same local network; your computer becomes the AI server for your household.
*
I managed to install it, but the problem is I can't download anything from the huggingface repo; it seems to be invalid credentials or something, as in Docker Desktop I also can't see any images, just blank when I search

I want to use the llama3-uncensored model
TSxxboxx
post May 13 2024, 10:28 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(iZuDeeN @ May 13 2024, 09:47 PM)
I managed to install it, but the problem is I can't download anything from the huggingface repo; it seems to be invalid credentials or something, as in Docker Desktop I also can't see any images, just blank when I search

I want to use the llama3-uncensored model
*
You need to have an account on huggingface to download. Some models need you to request permission, and only after it's granted can you see the download.

Which one do you want to download? Llama-3-8B-Lexi-Uncensored?
Go to Files and versions, then click the download icon next to the gguf name
user posted image

The uncensored versions basically just allow the AI to answer unethical or criminal questions. I've never tried using a gguf file with ollama; I just find models on ollama's own website and use the pull command. Usually when a huggingface model becomes famous enough, someone will upload it to ollama. Now I'm downloading llava-llama3, to see if it can recognize images better than llava
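The pull command itself is simple (a sketch; the model name is whichever one you find on ollama.com):
CODE
# download without starting a chat
ollama pull llava-llama3

# then chat with it once downloaded
ollama run llava-llama3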
TSxxboxx
post May 13 2024, 11:26 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


llava-llama3 is impressive. I asked it to describe this picture

user posted image

It answered
QUOTE
In the image, a lion-like creature is seated at a wooden desk in what appears to be a forest setting. The creature's fur is a light brown color, and it has long hair that cascades down its back. Its eyes are a striking blue, and it wears a serious expression on its face as it gazes directly into the camera.


The desk on which the creature sits is made of wood and is positioned in front of a backdrop of trees and foliage. The forest setting suggests an outdoor environment, possibly a clearing or a secluded spot within the woods.


On the desk, there's a laptop computer open to the screen. The laptop is silver in color, contrasting with the natural tones of the surrounding environment. The creature seems to be interacting with the laptop, perhaps using it for work or communication.


There are no discernible texts or other objects in the image. The relative position of the objects suggests that the lion-like creature is the main subject of this image, with the forest setting and the laptop serving as supporting elements to frame the scene. The image does not provide any information about the actions of the objects beyond what has been described.

ericcheng2021
post May 16 2024, 04:19 PM

Getting Started
**
Junior Member
182 posts

Joined: Mar 2021


Ollama is now very easy to run on Windows and download models for.
Trying llama2-uncensored to generate some, ahem, stories; it's really good at it.
TSxxboxx
post May 17 2024, 12:43 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(ericcheng2021 @ May 16 2024, 04:19 PM)
Ollama is now very easy to run on Windows and download models for.
Trying llama2-uncensored to generate some, ahem, stories; it's really good at it.
*
I tried llama3-chatqa, it was very fast to give answers
ericcheng2021
post May 17 2024, 07:11 PM

Getting Started
**
Junior Member
182 posts

Joined: Mar 2021


QUOTE(xxboxx @ May 17 2024, 12:43 PM)
I tried llama3-chatqa, it was very fast to give answers
*
What GPU are you using?
TSxxboxx
post May 17 2024, 09:43 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(ericcheng2021 @ May 17 2024, 07:11 PM)
What GPU are you using?
*
I'm using a 3080
chow1942
post Jul 9 2024, 08:01 AM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
I've tried running it on an old laptop, and only with the tinyllama model is the waiting time acceptable lmao
SUSMilfuntastic
post Jul 25 2024, 05:24 AM

Real man help each other not SUS one another
****
Junior Member
559 posts

Joined: Dec 2022
From: Chyna builds kingdom instead of BS about freedom
Interesting share tq
TSxxboxx
post Aug 10 2024, 10:21 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Trying out Flux. The quality is 👍
user posted image

But it needs lots of RAM. VRAM too, unless you run the schnell model, which limits VRAM use to 8GB
user posted image
chow1942
post Aug 13 2024, 08:36 PM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
Eh, flux? in ollama?
Probably not with my old hardware without a graphics card anyway tongue.gif
Phi3 just updated; acceptable speed for some easy questions
TSxxboxx
post Aug 15 2024, 08:46 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(chow1942 @ Aug 13 2024, 08:36 PM)
Eh, flux? in ollama?
Probably not with my old hardware without a graphics card anyway tongue.gif
Phi3 just updated; acceptable speed for some easy questions
*
Of course not. Too lazy to open a new thread.

Phi3 recently got an update? I saw the update was 2 weeks ago.

Phi3 doesn't feel on par with other LLMs, especially the latest Mistral and Llama3.1
Phi3:14b on the other hand is comparable.

One of the usual questions to ask an LLM: Sally is a girl. She has three brothers. Each of her brothers has the same two sisters. How many sisters does Sally have?

Phi3 goes into rambling mode, but still gets it wrong
QUOTE
If we are to assume that each brother indeed shares the same two sisters, and since it's established in the problem statement that there'deeply exists at least one girl (Sally), then logically speaking these two must be her only siblings. Therefore, aside from herself being counted among them as well, Sally does not have any other sisters besides possibly twins she shares with each of her brothers. However, this scenario seems a bit unlikely given the context and wording that suggests individuality amongst family members (as it typically would be in such problems). Thus another plausible interpretation is there's no additional girl aside from Sally—a common structure for sibling scenarios when not specified otherwise. In summary:

- She has 1 sister, herself.

Sally only counts as a sister to each of her brothers; therefore, she has zero other sisters besides possibly the twin or half-twin scenario which is commonly understood but was unnecessary here given that Sally's existence makes it clear there are no more siblings in this family setup than just herself and three male children.


While the others get it right

phi3:14b assumes it's a twin
QUOTE
Sally only has one sister, herself. Since each brother has two sisters and we know that she is one of them (as it's stated 'She', referring to Sally), then there must be just one other girl in the family - her own twin. So, they are three brothers and two girls in total.


mistral-nemo goes straight to the point
QUOTE
Let's think through this logically:

1. We know that Sally has three brothers.
2. Each of her brothers has two sisters.
3. Since they all share these two sisters, those must be Sally herself and another sister.

So, Sally has one sister.

Final answer: Sally has one sister.


llama3.1
QUOTE
The question states that each of Sally's brothers "has the same two sisters". This means Sally must be one of those two sisters for each brother.

So, Sally has 3 brothers, and each brother has the same two sisters, which includes Sally herself. That means there is another girl who is also a sister to all three brothers.

Therefore, Sally has only ONE other sister in addition to herself. The answer is 2.

chow1942
post Aug 16 2024, 08:14 PM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
qwen2:latest
QUOTE
Given the riddle, "Sally is a girl. She has three brothers. Each of her brothers has the same two sisters. How many sisters does Sally have?"


The answer to this riddle is:


Sally has two sisters.


Here's why: Each of Sally's brothers has the same two sisters. Since there are three brothers, you might initially think that each sister is counted multiple times, but in reality, it means Sally and one other sister together are counted as the "same two sisters" for all her brothers. So, Sally counts once for herself and then there is another sister who is also counted alongside her. That's a total of two sisters for Sally to have.
Makes me start to think all the Chinese AI models have fallen behind or something

deepseek-v2:latest
QUOTE
The riddle you're presenting seems like it might be intended as an optical illusion or play on words, rather than a straightforward problem with a literal interpretation. It appears to hint at something more complex about sibling relationships and how we categorize them (siblings versus half-siblings).


Given the standard understanding of siblings in most families, Sally would have one sister and three half-sisters if her parents each had another child apart from her. However, without additional context or clarification on whether Sally is also part of this sibling group or should be considered separately, it's difficult to provide a definitive answer based strictly on the information given.


If you're looking for a more straightforward interpretation and considering only siblings in the traditional sense (those with the same mother and father), then Sally would have one sister because she is not part of the group described as having two sisters each by virtue of sharing just one parent—that description seems to suggest that there are four children total, not three.


So based on this interpretation, Sally has:
1 sister (herself) + 3 half-sisters = Total of 4 siblings.


and this one is hallucinating lol

This post has been edited by chow1942: Aug 16 2024, 08:17 PM
TSxxboxx
post Aug 16 2024, 09:26 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(chow1942 @ Aug 16 2024, 08:14 PM)
qwen2:latest
Makes me start to think all the Chinese AI models have fallen behind or something

deepseek-v2:latest
and this one is hallucinating lol
*
qwen2:latest still got it right, but wrongly assumed a human would think 3 brothers means each sister is counted multiple times.

deepseek-v2:latest lol, seems like it's from a few generations ago
c2tony
post Aug 21 2024, 07:57 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


Phi3 updated to Phi3.5 and gets it right this time

QUOTE
Since all three of Sally's brothers share the same two sisters, it implies that these are also her siblings because in a family unit with multiple children like this one (including both male and female), there is only one set of sisters for each brother. Therefore, despite having three brothers, Sally has just one sister—the fact they all have "two" sisters at common refers to the same individual who counts once per sibling relationship in a family with multiple children sharing identical pairs among themselves. So, Sally indeed only has one biological sister.
chow1942
post Aug 30 2024, 08:17 PM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
Well, I cannot get it to make the right flag, the right tower or even the right cat

user posted image

This post has been edited by chow1942: Aug 30 2024, 08:18 PM
TSxxboxx
post Sep 1 2024, 09:35 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Probably Flux doesn't generate it exactly, or doesn't have enough data to be accurate. Maybe if you train some loras you can use them to generate an accurate flag and tower.

For fun it's good enough. This one I made with the anthroids lora
user posted image
TSxxboxx
post Sep 13 2024, 12:35 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


For vision AI, Qwen2-VL is very impressive!! This is the output for the Merdeka picture above:
QUOTE
This picture appears to be celebrating Hari Merdeka, which is Malaysia's Independence Day. The text "Hari Merdeka!" is prominently displayed at the top of the image. The individuals in the picture are holding Malaysian flags and are dressed in blue shirts, which is a common color associated with the Malaysian flag. The setting seems to be outdoors, possibly in a park or a similar open area, with trees visible in the background. The overall mood of the image is festive and patriotic.


And this is the text for the cat with the Malaysia flag:
QUOTE
This image depicts two cats sitting on a high structure, holding up the flag of Malaysia. The background shows a cityscape with numerous skyscrapers, suggesting that the cats are at a significant height above the ground. The scene is likely digitally manipulated to create a whimsical and imaginative setting.


This handwritten note
user posted image
QUOTE
The handwritten note on the paper reads:

```
Dear Lucinda,
Just a quick note to let you know how grateful I am for the beautiful flowers you dropped off last
```


Too bad it doesn't have a model for Ollama yet
chow1942
post Sep 17 2024, 07:35 PM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
Using Ollama minicpm-v model for merdeka pics

Can you describe this image for me please?

QUOTE
The image depicts a group of people celebrating Malaysia's Independence Day. They are standing outdoors with several Malaysian flags in the background, creating an atmosphere filled with national pride and joy. The text "Hari Merdeka!" suggests that this moment is significant for those involved as they commemorate their country’s independence day.


can you explain and describe that image in a very detail way?

[spoiler content not shown]



TSxxboxx
post Oct 24 2024, 10:32 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Ollama now supports GGUF
https://www.tomsguide.com/ai/ollama-just-ma...ternet-required

I haven't tried it yet, but this opens up many more choices
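If I understand the article correctly, you just point ollama straight at the Hugging Face repo, something like this (a sketch; {username}/{repository} is whichever GGUF repo you pick, and a quant tag can be appended):
CODE
ollama run hf.co/{username}/{repository}
ollama run hf.co/{username}/{repository}:Q4_K_M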
TSxxboxx
post Oct 26 2024, 07:25 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


chow1942, using minicpm-v can you get all the text from this image correctly?
user posted image

This is all I got using it
user posted image

But using one of the online servers it got very close to complete and correct
user posted image

I wonder if my parameters are not correct or if it's an ollama/open-webui engine issue.

Using llama 3.2 vision on one of the online servers also gives it correctly, but then it probably runs out of tokens
user posted image

I also tried on ChatGPT and everything is almost correct
user posted image
TSxxboxx
post Nov 8 2024, 11:23 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Llama 3.2 Vision for Ollama has been released!

Trying it with the earlier table, it is better than the other models but still not complete

user posted image


If I crop only the table it is better

user posted image

If I crop it to a shorter table, it's better but it still splits it into 2 tables the same way

user posted image
TSxxboxx
post Jan 25 2025, 12:14 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


deepseek-r1 is impressively a lot better than most other models that I tested. The chain of thought lets it give answers that are more correct and more on point.

The default 8b feels just a bit better than most other models; when it can't find the answer it makes something up, but the 14b will not. The 14b also answers by topic with a breakdown into sub-topics, while the 8b puts everything separately as individual topics. I can only imagine how much better the 671b would be.

I asked how many e's are in “defenselessness”. The 8b wasn't able to give the correct answer while the 14b could.
user posted image

Also interesting is the 14b's thinking on how to check that the answer is correct. It will also reference back to a previous answer even though I used a different model to get that answer
[spoiler content not shown]


While the 8b got stuck thinking it is 11 letters
[spoiler content not shown]


Other models can't get it right
user posted image

Even free ChatGPT can't get it right on the first try
user posted image

TSxxboxx
post Jan 25 2025, 11:01 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Asking for the number of letters exposes most models' flaws. Gemini 1.5 also got it wrong on the first try. Gemini 2.0 got it right on the first try
user posted image


Even mistral-nemo can't get it right
user posted image
crazyfatt
post Apr 8 2025, 04:01 AM

Getting Started
**
Junior Member
117 posts

Joined: May 2009
From: KL,Cheras



Bought an RTX 5090 and wish to learn about building local AI to play with. Anywhere I can get a guide on doing it, and which model is good now?
chow1942
post Apr 8 2025, 07:41 AM

Getting Started
**
Junior Member
280 posts

Joined: Dec 2006
QUOTE(crazyfatt @ Apr 8 2025, 04:01 AM)
Bought an RTX 5090 and wish to learn about building local AI to play with. Anywhere I can get a guide on doing it, and which model is good now?
*
There are a few popular frameworks for running LLMs locally:

ollama - Command-line focused but easy to use
LM Studio - Very user-friendly GUI for downloading and running models
Jan.ai - Open source local AI assistant with a nice interface

For models to run locally on your RTX 5090, some good options include:
Llama - Meta's flagship model, excellent all-around performance
Mistral - Great balance of size and capability
Phi - Microsoft's smaller but efficient model
Gemma - Google's open model with good performance

This post has been edited by chow1942: Apr 8 2025, 09:54 PM
TSxxboxx
post Apr 8 2025, 11:05 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(crazyfatt @ Apr 8 2025, 04:01 AM)
Bought an RTX 5090 and wish to learn about building local AI to play with. Anywhere I can get a guide on doing it, and which model is good now?
*
For generative text, what I'm using is Ollama for the server and Open WebUI for the GUI.
For download and installation, follow the GitHub pages:
https://github.com/ollama/ollama?tab=readme-ov-file
https://github.com/open-webui/open-webui

Follow the Open WebUI instructions on GitHub to set up and configure the web UI.

After installing, go to ollama.com to choose the model that you wish to try; deepseek-r1 is the famous one now, and since you got a 5090, the 32B model can fit into your GPU RAM. Open Command Prompt on your computer and type "ollama pull deepseek-r1:32b", and after that use Open WebUI and select the deepseek-r1 model. In the Open WebUI settings there's also a way to download models, but I find the command prompt easier.
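The terminal part of that is just these commands (a sketch of the same steps described above):
CODE
# fetch the model; per above, the 32B fits in a 5090's VRAM
ollama pull deepseek-r1:32b

# confirm the download
ollama list

# optional: chat in the terminal instead of Open WebUI
ollama run deepseek-r1:32b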

Some also prefer LM Studio as it's more straightforward, but I've never tried it.

You should watch the https://youtube.com/@azisk channel, he does a lot of generative text tests.

For generative AI images, I'm using ComfyUI for the server and Flux for the model/checkpoint. There's also Automatic1111 as the server, but I prefer ComfyUI as it's easier to make changes and apply filters or loras. The model/checkpoint is the generative brain, in plain words; some models are good for text, some are good for realistic images, some for anime, and many more. Go to civitai to see many different models trained for specific types of image. Some models support loras, which make additional changes to the model. There's also inpainting, which masks a certain part of an existing image and fills it with an image from your prompt. If you want to go more advanced, there's controlnet. You should also install ComfyUI-Manager to manage all these extensions.

For AI images I don't have a specific site for the installation walkthrough; what I did was search for tutorials on YouTube and follow their steps on how to install and use those functions.
davidletterboyz
post Apr 10 2025, 11:41 AM

Der Kaiser
*******
Senior Member
4,672 posts

Joined: Jan 2003


QUOTE(crazyfatt @ Apr 8 2025, 04:01 AM)
Bought an RTX 5090 and wish to learn about building local AI to play with. Anywhere I can get a guide on doing it, and which model is good now?
*
Where did you buy it and how much? Seems like the street price is way more than the suggested price.
c2tony
post Apr 30 2025, 05:24 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


Does anyone know how to turn off that thinking stuff on deepseek or qwen3? In ollama webui of course

P/S: https://huggingface.co/jedisct1/MiMo-7B-RL-...f?download=true

You can play with Xiaomi's AI

This post has been edited by c2tony: Apr 30 2025, 10:04 PM
TSxxboxx
post May 1 2025, 07:59 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Apr 30 2025, 05:24 PM)
Does anyone know how to turn off that thinking stuff on deepseek or qwen3? In ollama webui of course

P/S: https://huggingface.co/jedisct1/MiMo-7B-RL-...f?download=true

You can play with Xiaomi's AI
*
I remember when using deepseek, the thinking stuff is not shown unless you press the arrow beside the model name

How's the Xiaomi AI compared to deepseek? Better answers?

This post has been edited by xxboxx: May 1 2025, 08:00 AM
c2tony
post May 1 2025, 01:14 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 1 2025, 07:59 AM)
I remember when using deepseek, the thinking stuff is not shown unless you press the arrow beside the model name

How's the Xiaomi AI compared to deepseek? Better answers?
*
Yes, but I don't want it to show that arrow! It takes more time to show the steps, whether you click it or not

I did not manage to try MiMo yet; I don't know how to load a gguf
TSxxboxx
post May 1 2025, 04:29 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 1 2025, 01:14 PM)
Yes, but I don't want it to show that arrow! It takes more time to show the steps, whether you click it or not

I did not manage to try MiMo yet; I don't know how to load a gguf
*
Oh, you mean you don't want it to do the thinking stuff? I don't think you can; those models are designed for thinking. For questions that need deep thought to answer, these kinds of models are better than models that don't think. But for straightforward questions, such as calculations, these models waste a lot of time getting to the obvious answer.

Using terminal/CMD, type "ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF:Q8_0"
this will pull the Q8_0 8.1GB model

If you want the smaller 4.7GB model, type "ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF:Q4_K_M"

I tried it and the answers it gives feel as good as deepseek. When fed data to analyze, it does take some time to process before giving the answer
c2tony
post May 1 2025, 10:11 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 1 2025, 04:29 PM)
Oh, you mean you don't want it to do the thinking stuff? I don't think you can; those models are designed for thinking. For questions that need deep thought to answer, these kinds of models are better than models that don't think. But for straightforward questions, such as calculations, these models waste a lot of time getting to the obvious answer.

Using terminal/CMD, type "ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF:Q8_0"
this will pull the Q8_0 8.1GB model

If you want the smaller 4.7GB model, type "ollama pull hf.co/jedisct1/MiMo-7B-RL-GGUF:Q4_K_M"

I tried it and the answers it gives feel as good as deepseek. When fed data to analyze, it does take some time to process before giving the answer
*
Thanks for the command notworthy.gif
I tried it also

Are you familiar with the thought experiment of the ship of Theseus?
In the field of identity metaphysics?
If those removed planks are restored and reassembled, free of the rot, is that the ship of Theseus?

For the third question it "thought" for about 7 min

Neither is the true ship, or both are the true ship?
- it's still thinking...
TSxxboxx
post May 2 2025, 08:55 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 1 2025, 10:11 PM)
Thanks for the command notworthy.gif
I tried it also

Are you familiar with the thought experiment of the ship of Theseus?
In the field of identity metaphysics?
If those removed planks are restored and reassembled, free of the rot, is that the ship of Theseus?

For the third question it "thought" for about 7 min

Neither is the true ship, or both are the true ship?
- it's still thinking...
*
7 minutes of thinking, it really goes into deep thought sweat.gif

I asked a simpler question, "how many “e” in benzodiazepines?"
It took almost 2 minutes to think, and the thought process is long
[spoiler content not shown]


On the other hand, if I use a smaller model such as llama3.2, it is fast but gives a wrong answer shakehead.gif
"In the word "benzodiazepines," there are two "e"s and also one "i" but not affecting the total count of e"

Deepseek took longer than MiMo, 2 minutes plus, and gave the correct answer
c2tony
post May 2 2025, 02:05 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 2 2025, 08:55 AM)
[spoiler content not shown]
llama3.2:3b-instruct-fp16 after 2+ min answered: In the word "benzodiazepines", the letter "e" appears three times.
while smollm2:1.7b-instruct-fp16 gave me TypeError: NetworkError when attempting to fetch

Xiaomi's MiMo LLM is relatively new.
After all, they're all LLMs using the same "highway": pattern recognition. If AI starts to actually understand, then we might need to worry about their conscious awakening laugh.gif


TSxxboxx
post May 2 2025, 10:24 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 2 2025, 02:05 PM)
llama3.2:3b-instruct-fp16 after 2+ min answered: In the word "benzodiazepines", the letter "e" appears three times.
while smollm2:1.7b-instruct-fp16 gave me TypeError: NetworkError when attempting to fetch

Xiaomi's MiMo LLM is relatively new.
After all, they're all LLMs using the same "highway": pattern recognition. If AI starts to actually understand, then we might need to worry about their conscious awakening laugh.gif
*
I'm using the 2GB llama3.2; maybe that's why the answer is not good
c2tony
post May 26 2025, 08:17 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


Lately the gemma3 12b update is annoying; it gets split across my CPU & GPU and just won't run at 100% GPU anymore.
CODE

ollama ps
NAME                 ID              SIZE     PROCESSOR          UNTIL
gemma3:12b-it-qat    5d4fa005e7bb    12 GB    31%/69% CPU/GPU    4 minutes from now

TSxxboxx
post May 26 2025, 08:41 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 26 2025, 08:17 PM)
Lately the gemma3 12b update is annoying; it gets split across my CPU & GPU and just won't run at 100% GPU anymore.
CODE

ollama ps
NAME                 ID              SIZE     PROCESSOR          UNTIL
gemma3:12b-it-qat    5d4fa005e7bb    12 GB    31%/69% CPU/GPU    4 minutes from now

*
How many GB of VRAM do you have? Windows also uses some VRAM, so if it's 12GB then it's not enough. When there's not enough VRAM it will offload some of the model to the CPU, which causes the slowdown.
c2tony
post May 26 2025, 09:48 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 26 2025, 08:41 PM)
How many GB of VRAM do you have? Windows also uses some VRAM, so if it's 12GB then it's not enough. When there's not enough VRAM it will offload some of the model to the CPU, which causes the slowdown.
*
12GB VRAM, I know...
Qwen3 at 14b and 9.3GB uses 100% of my GPU, for comparison

CODE
ollama ps
NAME         ID              SIZE     PROCESSOR    UNTIL
qwen3:14b    7d7da67570e2    10 GB    100% GPU     4 minutes from now


CODE
NAME                          ID              SIZE      MODIFIED    
gemma3:12b-it-qat             5d4fa005e7bb    8.9 GB    2 weeks ago
qwen3:14b                     7d7da67570e2    9.3 GB    3 weeks ago


This post has been edited by c2tony: May 26 2025, 10:09 PM
ipohps3
post May 26 2025, 09:55 PM

Regular
******
Senior Member
1,974 posts

Joined: Dec 2011


dunno about you guys.

i was enthusiastic about open models earlier this year, with DeepSeek in Jan and other open models being released in the following months.

however, since Google Gemini 2.5 was released last month, i don't think i would want to go back to using open models, since Gemini+DeepMind is getting extremely good at almost all things and none of the open models that can run on an RTX 3090 come close to it.

after some time, paying the 20usd per month is more productive for me to get things done than using open models.

This post has been edited by ipohps3: May 26 2025, 09:56 PM
c2tony
post May 26 2025, 10:08 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(ipohps3 @ May 26 2025, 09:55 PM)
dunno about you guys.

i was enthusiastic about open models earlier this year, with DeepSeek in Jan and other open models being released in the following months.

however, since Google Gemini 2.5 was released last month, i don't think i would want to go back to using open models, since Gemini is getting extremely good at almost all things and none of the open models that can run on an RTX 3090 come close to it.

after some time, paying the 20usd per month is more productive for me to get things done than using open models.
*
Yeah, you paid. That's the whole point! "One-off" or "batch" processing is best when you pay. I wouldn't pay $20 for my use case.

Gemini is a closed system; you don't get to tweak it, audit its training data, or run it locally.
For some users, this trade-off is worth it.
For others, it's not. Not to mention there's a limit on the API, and it can't be used offline
TSxxboxx
post May 26 2025, 11:39 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 26 2025, 09:48 PM)
12GB VRAM, I know...
Qwen3 at 14b and 9.3GB uses 100% of my GPU, for comparison

CODE
ollama ps
NAME         ID              SIZE     PROCESSOR    UNTIL
qwen3:14b    7d7da67570e2    10 GB    100% GPU     4 minutes from now


CODE
NAME                          ID              SIZE      MODIFIED    
gemma3:12b-it-qat             5d4fa005e7bb    8.9 GB    2 weeks ago
qwen3:14b                     7d7da67570e2    9.3 GB    3 weeks ago

*
There you go, not enough VRAM.
Why is your gemma3:12b-it-qat 12GB? I see on the ollama page it is only 8.9GB

QUOTE(ipohps3 @ May 26 2025, 09:55 PM)
dunno about you guys.

i was enthusiastic about open models earlier this year, with DeepSeek in Jan and other open models being released in the following months.

however, since Google Gemini 2.5 was released last month, i don't think i would want to go back to using open models, since Gemini+DeepMind is getting extremely good at almost all things and none of the open models that can run on an RTX 3090 come close to it.

after some time, paying the 20usd per month is more productive for me to get things done than using open models.
*
Gemini has indeed got a lot better, ChatGPT too. For me it's just for fun, so I didn't pay for the more capable models. Maybe that's why I feel the free models are still less capable than the open-source models. Questions such as this one Gemini 2.5 Pro still gets wrong

user posted image
ipohps3
post May 27 2025, 01:27 AM

Regular
******
Senior Member
1,974 posts

Joined: Dec 2011


QUOTE(xxboxx @ May 26 2025, 11:39 PM)
There you go, not enough VRAM.
Why is your gemma3:12b-it-qat 12GB? I see on the ollama page it is only 8.9GB
Gemini has indeed got a lot better, ChatGPT too. For me it's just for fun, so I didn't pay for the more capable models. Maybe that's why I feel the free models are still less capable than the open-source models. Questions such as this one Gemini 2.5 Pro still gets wrong

user posted image
*
yeah. sometimes it gets the basics wrong. i tried on ChatGPT and it seems to get it right. but anyway i don't use it for this trivial stuff. i mainly use the YouTube video analysis, deep research, audio overview podcast, and canvas features for coding and for research on new topics. the main thing is its large 1M context window, which no one can support locally at home even if you have an open model that supports a 1M context window.

This post has been edited by ipohps3: May 27 2025, 01:28 AM
TSxxboxx
post May 27 2025, 10:13 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(ipohps3 @ May 27 2025, 01:27 AM)
yeah. sometimes it gets the basics wrong. i tried on ChatGPT and it seems to get it right. but anyway i don't use it for this trivial stuff. i mainly use the YouTube video analysis, deep research, audio overview podcast, and canvas features for coding and for research on new topics. the main thing is its large 1M context window, which no one can support locally at home even if you have an open model that supports a 1M context window.
*
For suggesting new ideas or perspectives an LLM is useful, but for analysis and research I find LLMs miss what is important to me, and in the end I still have to do the analysis and research myself.

A Mac Studio with 512GB RAM can handle a 1M context window or more, but the price sweat.gif
Maybe the upcoming Intel GPUs for AI will solve the memory bottleneck issue.
c2tony
post May 27 2025, 11:21 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 26 2025, 11:39 PM)
There you go, not enough VRAM.
Why your gemma3:12b-it-qat is 12GB? I see ollama page it is only 8.9GB


That's why I said after the update, a 12b model of 8.9GB becomes 12GB when you actually run it.

Gemini's response: it gets de-quantized or processed at a higher precision during runtime. The architecture, the specific runtime precision of activations and the KV cache, and the optimizations applied by the inference framework.
Gemma 3's multimodal nature and potentially different KV cache handling seem to be key contributors to its higher observed runtime memory usage compared to Qwen 2 14B models of similar quantization.

this means it's time for a higher VRAM GPU or an upgrade to an NPU
TSxxboxx
post May 28 2025, 11:48 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 27 2025, 11:21 PM)
That's why I said after the update, a 12b model of 8.9GB becomes 12GB when you actually run it.

Gemini's response: it gets de-quantized or processed at a higher precision during runtime. The architecture, the specific runtime precision of activations and the KV cache, and the optimizations applied by the inference framework.
Gemma 3's multimodal nature and potentially different KV cache handling seem to be key contributors to its higher observed runtime memory usage compared to Qwen 2 14B models of similar quantization.

this means it's time for a higher VRAM GPU or an upgrade to an NPU
*
I see what you mean now. gemma3:latest is 3.3GB but when run it uses 6.0GB
Maybe ollama added more functions and those also keep increasing the memory usage.
I saw this: just having a longer context length will increase memory usage a lot
https://github.com/open-webui/open-webui/discussions/8303
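One knob for that, if you want to trade context length for memory, is the num_ctx parameter in a Modelfile (a sketch; gemma3-small is a made-up alias name):
CODE
# Modelfile
FROM gemma3:latest
PARAMETER num_ctx 2048

# then build and run the smaller-context variant:
#   ollama create gemma3-small -f Modelfile
#   ollama run gemma3-small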
TSxxboxx
post May 28 2025, 12:13 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


I think it's because with the new v0.7.0 engine, ollama added support for multimodal models, and this increases memory usage significantly
QUOTE
Ollama now supports multimodal models via Ollama’s new engine, starting with new vision multimodal models:

Meta Llama 4
Google Gemma 3
Qwen 2.5 VL
Mistral Small 3.1
and more vision models.


I checked a few models, size on disk vs actually loaded:
phi4:latest 9.1GB becomes 10GB
deepseek-r1:14b 9.0GB becomes 10GB
MiMo-7B-RL-GGUF:Q8_0 8.1GB becomes 9.6GB
gemma3:12b 8.1GB becomes 11GB
gemma3:latest 3.3GB becomes 6.0GB
llama3.2:latest 2.0GB becomes 4.0GB
granite3.2-vision:2b-fp16 6.0GB becomes 8.8GB

For models that support vision, the size increases a lot more than for models without vision
TSxxboxx
post May 28 2025, 12:39 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(xxboxx @ Oct 26 2024, 07:25 PM)
chow1942, using minicpm-v can you get all the text from this image correctly?
user posted image

This is all I got using it
user posted image

But using one of the online servers it got very close to complete and correct
user posted image

I wonder if my parameters are not correct or if it's an ollama/open-webui engine issue.

Using llama 3.2 vision on one of the online servers also gives it correctly, but then it probably runs out of tokens
user posted image

I also tried on ChatGPT and everything is almost correct
user posted image
*
Ollama's vision after the update is now a lot better than a few months ago

Using gemma3:12b there is some wrong data, but it's a lot better than previously
user posted image

Qwen 2.5 also recently updated its vision model, and it is more accurate than gemma3 even though it's only 7b vs 12b
user posted image

Even with a bigger picture that has more data it can still get most things right
user posted image
c2tony
post May 30 2025, 10:12 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 28 2025, 12:39 PM)
Ollama's vision after the update is now a lot better than a few months ago

Using gemma3:12b there is some wrong data, but it's a lot better than previously
[spoiler content not shown]

*
Geez... You must be using a lot of OCR? tongue.gif

They get better and better, with a lot of added rules and regulations; the uncensored wild west is disappearing lol.gif
ipohps3
post May 30 2025, 10:19 PM

Regular
******
Senior Member
1,974 posts

Joined: Dec 2011


Anyone tried Gemma 3n 4B?
TSxxboxx
post May 30 2025, 10:28 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 30 2025, 10:12 PM)
Geez... You must be using a lot of OCR? tongue.gif

They get better and better, with a lot of added rules and regulations; the uncensored wild west is disappearing lol.gif
*
Yup, mainly for handwriting in table formats like that. Other OCR apps such as Snipping Tool or OneNote can't correctly recognize all the text, or can't preserve the table format. I was using ChatGPT before this, but after a few photos I'd already reached the daily cap for free users. I then used Gemini, but it is not as accurate as ChatGPT. Now I can rely on ollama to do OCR.

Pros and cons. The penalty in memory usage is the hardest to swallow. Hopefully in future they can bring memory usage back down.

This post has been edited by xxboxx: May 30 2025, 10:32 PM
c2tony
post May 30 2025, 10:29 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(ipohps3 @ May 30 2025, 10:19 PM)
Anyone tried Gemma 3n 4B?
*
What platform did you use on the mobile or edge side? I installed PocketPal with llama-3.2-1b-instruct only; mobile has lots of distractions grin.gif
TSxxboxx
post May 30 2025, 10:31 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(ipohps3 @ May 30 2025, 10:19 PM)
Anyone tried Gemma 3n 4B?
*
Gemma3:4B?

I tried it, less capable than the 12B model
c2tony
post May 30 2025, 10:39 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ May 30 2025, 10:31 PM)
Gemma3:4B?

I tried it, less capable than the 12B model
*
I think he means this:

user posted image
TSxxboxx
post May 31 2025, 08:35 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ May 30 2025, 10:39 PM)
I think he means this:

user posted image
*
Oh, no wonder I didn't see it; I only checked the ollama website and it's not there yet
c2tony
post Jun 4 2025, 11:08 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(ipohps3 @ May 30 2025, 10:19 PM)
Anyone tried Gemma 3n 4B?
*
QUOTE(xxboxx @ May 31 2025, 08:35 AM)
Oh, no wonder I didn't see it; I only checked the ollama website and it's not there yet
*
Download https://github.com/google-ai-edge/gallery/releases/tag/1.0.3 ; I tried Gemma-3n-E4B-it-int4 on my phone today.
My Honor Magic 6 Pro turned into a hand warmer, 3.51 tokens/s
Lower if multitasking, and I don't have the patience so I just closed it tongue.gif

there's a youtuber talking about it
https://youtu.be/Vb8L5mtjLDo?si=fxp9nddnJ8zsuO08


ipohps3
post Jun 4 2025, 11:15 PM

Regular
******
Senior Member
1,974 posts

Joined: Dec 2011


anyone tried the DeepSeek R1 0528 Qwen distilled version?

how is it?
TSxxboxx
post Jun 4 2025, 11:24 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jun 4 2025, 11:08 PM)
Download https://github.com/google-ai-edge/gallery/releases/tag/1.0.3 ; I tried Gemma-3n-E4B-it-int4 on my phone today.
My Honor Magic 6 Pro turned into a hand warmer, 3.51 tokens/s
Lower if multitasking, and I don't have the patience so I just closed it tongue.gif

there's a youtuber talking about it
https://youtu.be/Vb8L5mtjLDo?si=fxp9nddnJ8zsuO08
*
These small-B LLMs feel only good for simple tasks like rewriting text, simple OCR and calculations, as shown in the advert. I tried it, and it can detect a receipt and do the calculations, but it's very bad at handwriting, which needs more complex OCR, and at counting many letters

user posted image

user posted image
user posted image

user posted image
c2tony
post Jun 4 2025, 11:58 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(ipohps3 @ Jun 4 2025, 11:15 PM)
anyone tried the DeepSeek R1 0528 Qwen distilled version?

how is it?
*
It can't answer
CODE
how many e in “defenselessness”
It took more than 5 min and was still thinking, so I stopped it.
TSxxboxx
post Jun 5 2025, 02:08 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(ipohps3 @ Jun 4 2025, 11:15 PM)
anyone tried the DeepSeek R1 0528 Qwen distilled version?

how is it?
*
I tried the 8b model, and it seems the model has a bug where it doesn't stop thinking and starts to hallucinate about something else after a few minutes.

I asked: A robot has 8 arms. There are 5 objects on a table: a knife, a fork, a spoon, a teddy bear and a doll. The robot picks up each object with an arm. He then shakes hands with himself. How many arms does he have free?

After 10 minutes and 95k characters of going off topic, I stopped it. This is its thinking. Can't show all the text because it's over the LYN post limit
[spoiler content not shown]

TSxxboxx
post Jun 5 2025, 02:09 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


The remaining thinking before I hit stop
[spoiler content not shown]

TSxxboxx
post Jun 5 2025, 02:12 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


With the 14b model, it took just 3 minutes to think before coming up with the answer
CODE
The robot has 8 arms. After picking up each of the 5 objects with an arm, he uses 2 more arms to shake hands with himself. Therefore, the number of free arms is 8 - (5 + 2) = 1.

Answer: The robot has 1 arm free.



[spoiler content not shown]


This post has been edited by xxboxx: Jun 5 2025, 02:13 PM
TSxxboxx
post Jun 5 2025, 03:32 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


I tried it on Gemma too

gemma3:4b
QUOTE
Let's analyze the situation.
The robot has 8 arms.
The robot picks up each of the 5 objects with an arm. So, 5 arms are used to pick up the objects.
The robot shakes hands with himself. This means the robot uses 2 arms to shake hands with itself.
The total number of arms used is 5 (picking up objects) + 2 (shaking hands) = 7 arms.
The number of arms free is the total number of arms the robot has minus the number of arms used.
The number of arms free is 8 - 7 = 1.

Now we write out the argument and find the final answer.
The robot has 8 arms.
The robot picks up each of the 5 objects with an arm. So, the number of arms used to pick up objects is 5.
The robot shakes hands with himself. This means the robot uses 2 arms to shake hands with itself.
Total number of arms used = 5 (picking up objects) + 2 (shaking hands) = 7.
The number of arms free = 8 (total arms) - 7 (arms used) = 1.

Final Answer: The final answer is  1

When asked about e in “defenselessness”, it says: There are three “e”s in the word “defenselessness”.


gemma3:12b-it-qat
QUOTE
This is a bit of a trick question! Here's the breakdown:

  Picking up objects: The robot uses 5 arms to pick up the 5 objects.
  Shaking hands with himself: This means he uses two arms (one for each hand) to shake his own hand.

Therefore:

  Total arms used: 5 (picking up) + 2 (handshake) = 7 arms
  Arms free: 8 (total) - 7 (used) = 1 arm

Answer: The robot has 1 arm free.

It's designed to make you think about the handshake being a more complicated process, but it's a simple use of two arms!

When asked about e in “defenselessness”, it says: There are three "e"s in the word "defenselessness".


gemma3:12b thinks something else
QUOTE
This is a trick question! Here's how to solve it:

  Picking up objects: The robot uses 5 arms to pick up the 5 objects.
  Shaking hands with himself: This is the key! A robot can't shake hands with himself. It's a nonsensical action. The question is designed to distract you.

Therefore, the robot has 8 arms free. The actions described don't reduce the number of free arms.

When asked about e in “defenselessness”, it says: Let's count them! There are three "e"s in "defenselessness".


I asked another question to the 3 Gemma models
CODE
Bob has three boxes in front of him - Box A, Box B and Box C. Bob does not know what is in the boxes. Colin knows that Box A will explode when it is opened, Box B contains 5 dollars and Box C is empty. Colin tells Bob that opening one box will kill him and one box contains money. Should Bob open a box?


gemma3:12b
QUOTE
This is a clever logic puzzle! Here's how to break it down and the correct course of action for Bob.

Understanding the Information

  Colin's Statements:
      "Opening one box will kill him." (This means one box is the killer box - Box A)
      "One box contains money." (This means Box B has the money)
  Bob's Knowledge: Bob doesn't know what's in the boxes. He only knows Colin's statements.

Analyzing the Statements

Colin is telling the truth, but he's doing it in a way that is designed to mislead Bob. Let's consider what each statement *implies*:

  "Opening one box will kill him."  This doesn't tell Bob which box is the killer. It simply confirms that there is a killer box.
  "One box contains money." This confirms that there *is* a box with money.

The Solution

Bob should not open a box.

Here's why:

  Colin's statements are designed to trick Bob into thinking the killer box is obvious. However, they don't provide enough information for Bob to be certain.
  If Bob opens a box, there's a high probability he will die. Even if Colin is being truthful, Bob doesn't have enough information to make an informed decision about which box to open.
  The safest course of action is to avoid opening any of the boxes.

Important Note: This is a hypothetical puzzle. Do not attempt to replicate the situation described.



gemma3:12b-it-qat
QUOTE
This is a tricky logic puzzle! Here's how to break it down and the conclusion Bob should reach:

Understanding the Information

  Fact 1: Box A will kill Bob if opened.
  Fact 2: Box B contains $5.
  Fact 3: Box C is empty.
  Colin's Statements:
      "Opening one box will kill him." (This confirms Box A is the killer box)
      "One box contains money." (This confirms Box B has the money)

The Solution

Bob should not open a box.

Here's why:

  Colin's clues are redundant: Colin's statements are already known due to the setup of the puzzle. He's providing information Bob already knows.
  The risk is certain: The problem states that opening Box A will kill Bob.  There's no need to take a chance.


Important Note: This is a logic puzzle, and assumes perfect knowledge of the setup. In a real-world scenario, one should never take risks based on information from an unknown source!



gemma3:4b, on the other hand, is not so smart
QUOTE
This is a classic logic puzzle! Here’s how to solve it:

  Colin’s Information: Colin knows Box A explodes, Box B has money, and Box C is empty. This means Colin knows Box B contains the money.

  Bob’s Deduction: Bob knows Colin knows that Box B has money. Therefore, if Bob opens Box B, he will confirm Colin’s knowledge.

  The Solution: Bob should open Box A. This is the only way to disprove Colin’s information.

Let me know if you'd like to try another puzzle!

ipohps3
post Jun 5 2025, 03:37 PM

Regular
******
Senior Member
1,974 posts

Joined: Dec 2011


what is it qat quantization?
TSxxboxx
post Jun 5 2025, 07:20 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


According to the Gemma page:
Quantization aware trained models (QAT)
The quantization aware trained Gemma 3 models preserves similar quality as half precision models (BF16) while maintaining a lower memory footprint (3x less compared to non-quantized models).
c2tony
post Jun 7 2025, 08:25 AM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(ipohps3 @ Jun 5 2025, 03:37 PM)
what is it qat quantization?
*
it = instruction tuned, not that the model is fluent in the Italian language 😁

quantization:
Conversion of the finished painting to a desired jpeg compression

qat (Quantization-Aware Training):
Instead of compressing the photo into a smaller jpeg afterwards, we tell the artist to paint with fewer colors in the first place

hmm...... is that why Gemma3 occupies so much more memory but isn't that slow?

btw

IT-QAT refers to instruction-tuned Quantization-Aware Training (QAT) models, specifically in the Gemma 3 series. These models are optimized using QAT to maintain high quality while significantly reducing memory requirements, making them more efficient for deployment on consumer-grade GPUs.

For example:
- Gemma 3 27B IT-QAT → Reduced from 54GB to 14.1GB
- Gemma 3 12B IT-QAT → Reduced from 24GB to 6.6GB
- Gemma 3 4B IT-QAT → Reduced from 8GB to 2.6GB
- Gemma 3 1B IT-QAT → Reduced from 2GB to 0.5GB

These models are designed to preserve similar quality as half-precision models (BF16) while using less memory, making them ideal for running locally on devices with limited resources.
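
If you want to poke at one of these from a script, here's a minimal sketch against Ollama's local HTTP API (assuming the server is running on the default port 11434 and you've already pulled the model with "ollama pull gemma3:12b-it-qat"):
CODE
import requests

# Ask the IT-QAT build of Gemma 3 a question through Ollama's generate endpoint.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3:12b-it-qat",  # the QAT tag discussed above
        "prompt": "Explain quantization-aware training in one paragraph.",
        "stream": False,               # one JSON object instead of a token stream
    },
    timeout=300,
)
print(resp.json()["response"])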

This post has been edited by c2tony: Jun 7 2025, 08:50 AM
TSxxboxx
post Jun 7 2025, 11:19 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Can only salivate over such LLM performance. Waiting for the day Intel releases their B60 GPU with 24GB and hopefully at around 2k price lol

c2tony
post Jun 8 2025, 12:17 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Jun 7 2025, 11:19 AM)
Can only salivate over such LLM performance. Waiting for the day Intel releases their B60 GPU with 24GB and hopefully at around 2k price lol
*
This is too extreme! I don't do much with AI nowadays other than satisfying my curiosity, so perplexity.ai, gemini and copilot on the phone are more than enough.

ps: scanning every receipt and letting AI do the accounting looks like a great use of AI whistling.gif
TSxxboxx
post Jun 8 2025, 05:44 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jun 8 2025, 12:17 PM)
This is too extreme! I don't do much with AI nowadays other than satisfying my curiosity, so perplexity.ai, gemini and copilot on the phone are more than enough.

ps: scanning every receipt and letting AI do the accounting looks like a great use of AI  whistling.gif
*
Even if it's only a hobby, being able to run bigger-parameter models gets us more intelligent AI. Like the comparison above, gemma3:12b is a lot more capable than gemma3:4b and similar to deepseek-r1:14b. With access to more VRAM we could run gemma3:27b or even deepseek-r1:70b, which should be even more capable.

I've been feeding gemma3:12b a few photos of handwriting, and each time it answered some part wrongly I corrected it. After a few rounds its recognition of the handwriting has improved a lot compared to the first attempt, but there are still some mistakes. With gemma3:27b and its higher intelligence there should be even fewer mistakes.
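
Worth noting the model isn't being retrained by this; the corrections just accumulate in the chat history, so the improvement holds within that session. A rough sketch of the correction loop over Ollama's /api/chat endpoint (the file name and the readings are made-up placeholders):
CODE
import base64, requests

def b64(path):
    # /api/chat takes images as base64 strings attached to a message
    return base64.b64encode(open(path, "rb").read()).decode()

# Earlier turns (wrong reading + my correction) stay in context,
# so the next attempt can use them. "note1.jpg" is a placeholder.
messages = [
    {"role": "user", "content": "Transcribe this handwriting.",
     "images": [b64("note1.jpg")]},
    {"role": "assistant", "content": "The note says: meet at 3pm"},
    {"role": "user", "content": "Not quite, it says: meet at 8pm. Try again.",
     "images": [b64("note1.jpg")]},
]
resp = requests.post(
    "http://localhost:11434/api/chat",
    json={"model": "gemma3:12b", "messages": messages, "stream": False},
    timeout=300,
)
print(resp.json()["message"]["content"])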
c2tony
post Jun 8 2025, 10:56 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Jun 8 2025, 05:44 PM)
Even if it's only a hobby, being able to run bigger-parameter models gets us more intelligent AI. Like the comparison above, gemma3:12b is a lot more capable than gemma3:4b and similar to deepseek-r1:14b. With access to more VRAM we could run gemma3:27b or even deepseek-r1:70b, which should be even more capable.

I've been feeding gemma3:12b a few photos of handwriting, and each time it answered some part wrongly I corrected it. After a few rounds its recognition of the handwriting has improved a lot compared to the first attempt, but there are still some mistakes. With gemma3:27b and its higher intelligence there should be even fewer mistakes.
*
ikr
Intel has been complacent about their processors
hopefully they won't make the same mistakes with GPUs this time

there's no easy route to running AI locally, let's hope for the Intel Arc GPUs laugh.gif

sometimes i feel the rush to get one of those old 2080s modified to 22GB from china, but i chicken out whistling.gif
c2tony
post Jun 11 2025, 09:42 AM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Jun 8 2025, 05:44 PM)
Even if it's only a hobby, being able to run bigger-parameter models gets us more intelligent AI. Like the comparison above, gemma3:12b is a lot more capable than gemma3:4b and similar to deepseek-r1:14b. With access to more VRAM we could run gemma3:27b or even deepseek-r1:70b, which should be even more capable.

I've been feeding gemma3:12b a few photos of handwriting, and each time it answered some part wrongly I corrected it. After a few rounds its recognition of the handwriting has improved a lot compared to the first attempt, but there are still some mistakes. With gemma3:27b and its higher intelligence there should be even fewer mistakes.
*
here's something interesting I found: AI processors with loads of RAM to use for larger models

https://youtu.be/B7GDr-VFuEo?si=mK-jvQuXkHwmptel
TSxxboxx
post Jun 12 2025, 08:29 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jun 11 2025, 09:42 AM)
here's something interesting I found: AI processors with loads of RAM to use for larger models

https://youtu.be/B7GDr-VFuEo?si=mK-jvQuXkHwmptel
*
I watched the video; the Ryzen AI MAX+ 395 is indeed a powerful CPU for AI, it even beats the M4. It's just that this CPU's price is still very high.

Maybe in 1 or 2 years' time we'll get such a powerful CPU at a mid-range price.
c2tony
post Jun 12 2025, 09:34 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Jun 12 2025, 08:29 PM)
I watched the video; the Ryzen AI MAX+ 395 is indeed a powerful CPU for AI, it even beats the M4. It's just that this CPU's price is still very high.

Maybe in 1 or 2 years' time we'll get such a powerful CPU at a mid-range price.
*
for the price it's better value, only changing the processor, motherboard and ram biggrin.gif still better than a single gpu card at the same price whistling.gif
It's a relatively new processor; I've only seen the Intel Core Ultra.
Haven't seen anyone selling the AMD AI processor yet, but you can get the AM5 8600G and 8700G for the same function
TSxxboxx
post Jun 13 2025, 12:51 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jun 12 2025, 09:34 PM)
for the price it's better value, only changing the processor, motherboard and ram biggrin.gif still better than a single gpu card at the same price  whistling.gif
It's a relatively new processor; I've only seen the Intel Core Ultra.
Haven't seen anyone selling the AMD AI processor yet, but you can get the AM5 8600G and 8700G for the same function
*
But the 8600G and 8700G have a different iGPU than the Ryzen AI MAX+ 395; do they have the same performance?
c2tony
post Jun 13 2025, 10:36 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Jun 13 2025, 12:51 AM)
But the 8600G and 8700G have a different iGPU than the Ryzen AI MAX+ 395; do they have the same performance?
*
8700G = 16 TOPS
ryzen ai max+ 395 = 55 TOPS
RTX3060 12GB = 100 TOPS
Apple Mac Studio M4 Max = 38 TOPS

They all can run.

BTW, 55 TOPS may sound like more AI power than 38 TOPS, but the way Apple handles data and optimizes usage can deliver equivalent or faster AI execution.
Even if your PC has 128GB of RAM, your GPU might be capped by its 24GB VRAM when loading a large AI model.
With Apple's unified memory, you might comfortably run a llama4:16x17b entirely in GPU-addressable space if you have 96GB of RAM.

TSxxboxx
post Jun 14 2025, 12:41 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jun 13 2025, 10:36 PM)
8700G = 16 TOPS
ryzen ai max+ 395 = 55 TOPS
RTX3060 12GB = 100 TOPS
Apple Mac Studio M4 Max = 38 TOPS

They all can run.

BTW, 55 TOPS may sound like more AI power than 38 TOPS, but the way Apple handles data and optimizes usage can deliver equivalent or faster AI execution.
Even if your PC has 128GB of RAM, your GPU might be capped by its 24GB VRAM when loading a large AI model.
With Apple's unified memory, you might comfortably run a llama4:16x17b entirely in GPU-addressable space if you have 96GB of RAM.
*
The bigger the model, the more capable the GPU/NPU/CPU needs to be, in addition to the memory bandwidth.
The RTX PRO 6000 video shows that when running Qwen2.5 Coder 32B FP16 (61GB in size), even the M4 Max with its 500GB/second memory bandwidth only gets 7.63 tokens/second, while the RTX PRO 6000 still gets a good 23 tokens/second. On the Ryzen AI Max+ 395 he uses Qwen2.5 Coder 32B q4_k_m, which is only 20GB, but it only gets 10.8 tokens/second. The 395 is a very capable CPU but it's limited by its memory bandwidth.
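
Those numbers line up with a simple back-of-envelope rule: when decoding is memory-bound, each generated token has to stream roughly the whole model through memory once, so tokens/second is capped at about bandwidth divided by model size. A quick sanity check (the 395's ~256GB/s and the RTX PRO 6000's ~1.8TB/s bandwidths are my assumptions; the rest is from the video):
CODE
# Memory-bound decode ceiling: tokens/s ≈ memory bandwidth / model size.
# Observed speeds come in somewhat under the ceiling due to overhead.
cases = [
    # (setup,                          GB/s,  model GB, observed t/s)
    ("M4 Max, Qwen2.5 Coder 32B FP16",  500,  61,  7.63),
    ("RTX PRO 6000, same FP16 model",  1792,  61,  23.0),  # bandwidth assumed
    ("Ryzen AI Max+ 395, 32B q4_k_m",   256,  20,  10.8),  # bandwidth assumed
]
for name, bw, size, seen in cases:
    print(f"{name}: ceiling ~{bw / size:.1f} t/s, observed {seen} t/s")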
TSxxboxx
post Jun 20 2025, 11:28 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Gemini can now generate images with accurate long text! I tried and there's not even one wrong letter. But the image is not as highly detailed as those generated by Flux, and the text is plain without much detail.
user posted image

Flux generates a lot more detail than Gemini but still has issues producing accurate text, and can't render all the text if it's too long. I had to generate tens of images to get this one that looks good with the first part of the text accurate.
user posted image

Just for fun I tried creating images with an added prompt for Studio Ghibli artistic style. First is Gemini and second is Flux.

user posted image

user posted image
c2tony
post Jul 16 2025, 09:38 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.




still like flux, but gemini is sufficient. it's like still liking a dslr when a camera phone is sufficient tongue.gif
TSxxboxx
post Jul 16 2025, 11:19 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Jul 16 2025, 09:38 PM)


still like flux, but gemini is sufficient. it's like still liking a dslr when a camera phone is sufficient  tongue.gif
*
Yup, if you need something fast and presentable, Gemini is good enough, just like a camera phone.

I watched this video of a data center with clusters of GPUs; it's hundreds or thousands of times more powerful than any PC's GPU. But the images created by Gemini are still not up to the level of Flux; Google restricts Gemini's ability hard so that they can save processing power.

c2tony
post Aug 1 2025, 10:14 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


now you can run ollama without docker!!!
TSxxboxx
post Aug 2 2025, 01:05 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Aug 1 2025, 10:14 PM)
now you can run ollama without docker!!!
*
I don't understand. I've been running ollama without docker from the start. Or do you mean the WebUI? I didn't see any instructions on how to do this.
c2tony
post Aug 2 2025, 09:40 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Aug 2 2025, 01:05 PM)
I don't understand. I've been running ollama without docker from the start. Or do you mean the WebUI? I didn't see any instructions on how to do this.
*
right-click the ollama icon near the clock and select "open ollama"

or if you're using linux after all, then i have no clue
TSxxboxx
post Aug 2 2025, 11:53 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Aug 2 2025, 09:40 PM)
right-click the ollama icon near the clock and select "open ollama"

or if you're using linux after all, then i have no clue
*
Just updated ollama and I understand what you mean now. No need to use the WebUI anymore.
The settings are so simple that you can't even see the tokens per second. But it looks faster than the WebUI and more accurate too.
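
If you miss seeing the tokens per second, the local API still reports it: every /api/generate response includes eval_count (generated tokens) and eval_duration (in nanoseconds), so you can compute it yourself. A small sketch (the model tag is just an example):
CODE
import requests

# The API returns token counts and timings even though the app hides them.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "gemma3:12b", "prompt": "Why is the sky blue?", "stream": False},
    timeout=300,
).json()

tokens = resp["eval_count"]            # tokens generated
seconds = resp["eval_duration"] / 1e9  # eval_duration is in nanoseconds
print(f"{tokens} tokens in {seconds:.1f}s = {tokens / seconds:.1f} tokens/s")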
TSxxboxx
post Aug 5 2025, 05:53 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Qwen image sounds promising. It can do long text. Haven't seen yet whether it can be used with ComfyUI.

user posted image
TSxxboxx
post Aug 6 2025, 08:51 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33



QWEN3, the latest high-performance LLM
user posted image

But the size is bigger than 32GB at 8-bit quantization and bigger than 16GB at 4-bit. LM Studio supports splitting a model across 2 GPUs to get around out-of-memory issues, but the tokens per second are a lot slower than on Apple's M chips.

user posted image
Interestingly there's a red bar that flashes for a second with the name blurred out. Could that be the AMD AI Max+ 395? He did say he couldn't share AMD results until the next week when someone asked; maybe they're still under embargo.
TSxxboxx
post Sep 5 2025, 05:18 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Been a while since I last played with LLMs. I tried GPT-OSS:20B, which came with the Ollama app. It's a thinking model but was able to come to a conclusion in less than 2 seconds, much quicker than deepseek-r1, which thinks for a very long time even on a simple calculation and took almost 37 seconds.
user posted image

I asked this question on deepseek-r1, GPT-OSS:20B and gemma3:12b:
CODE
Bob has three boxes in front of him - Box A, Box B and Box C. Bob does not know what is in the boxes. Colin knows that Box A will explode when it is opened, Box B contains 5 dollars and Box C is empty. Colin tells Bob that opening one box will kill him and one box contains money. Should Bob open a box?
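
For anyone who wants to reproduce the comparison, a quick loop over the local API works (a sketch; the model tags are examples of what's pulled locally, and the puzzle text is assumed saved to box_puzzle.txt):
CODE
import time, requests

# Send the same puzzle to each model and time the full (non-streaming) response.
PROMPT = open("box_puzzle.txt").read()   # the Bob/Colin puzzle above

for model in ["deepseek-r1:14b", "gpt-oss:20b", "gemma3:12b"]:
    t0 = time.time()
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": model, "prompt": PROMPT, "stream": False},
        timeout=600,
    ).json()
    print(f"--- {model} ({time.time() - t0:.1f}s) ---")
    print(resp["response"][:400])        # first part of each answer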


deepseek-r1 understood the story wrongly, saying Colin revealed each box's contents, and that also led it to a wrong conclusion
user posted image


gemma3:12b doesn't show its thinking; it managed to understand the story correctly and also correctly decided what Bob should do
user posted image


GPT-OSS:20B gives a PhD-level answer after thinking for 3.1 seconds
user posted image


TSxxboxx
post Sep 5 2025, 05:35 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(xxboxx @ May 28 2025, 12:39 PM)
Ollama's vision after the latest update is a lot better than a few months ago

Using gemma3:12b there is some wrong data, but it's a lot better than previously
user posted image

user posted image
*
The latest Ollama also improved on character recognition; all the characters are correctly presented. But the Ollama app still can't add pictures, so I have to use Open WebUI.
user posted image
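
If you'd rather not run Open WebUI just for that, the plain API accepts pictures too: /api/generate takes base64-encoded images alongside the prompt for vision models. A minimal sketch (the file name is a placeholder):
CODE
import base64, requests

# Vision models accept base64 images through the API even though the
# desktop app has no attach button. "photo.jpg" is a placeholder path.
img = base64.b64encode(open("photo.jpg", "rb").read()).decode()
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "gemma3:12b",
        "prompt": "Read out all the text in this image.",
        "images": [img],
        "stream": False,
    },
    timeout=300,
)
print(resp.json()["response"])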
kingkingyyk
post Sep 16 2025, 07:55 PM

10k Club
Elite
15,694 posts

Joined: Mar 2008
https://www.hp.com/my-en/shop/hp-z2-mini-g1...3b79700e1d9983f

HP finally brought this machine here but the spec is lame. doh.gif Could have been a great MoE (GPT-OSS 120b) inference device with 128GB memory.
TSxxboxx
post Sep 17 2025, 01:03 AM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


The photo shows up to 128GB RAM but there's no option to configure the machine 🤷

Meanwhile I'm having fun with Google's Nano Banana. It's as good as running a local generative model such as Flux.

I add this photo
user posted image

And asked it to create a realistic photo based on that image.
user posted image

I asked it to change the background to a rooftop, with some slight digital effects on the sides of the image
user posted image

I asked it to change it to daytime
user posted image

Then I asked for an explosion effect in the background similar to a Hollywood big-budget movie (it blocked the request when I added Hollywood to the prompt, so I changed it to big budget)
user posted image

Then to add a bold heading at the bottom, with the words appearing to be made of neon light bulbs:
lowyat.net
malaysia's largest online community
user posted image

Its ability to stay consistent across each new photo generation is amazing.
TSxxboxx
post Sep 28 2025, 11:13 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


Qwen released their image_edit_2509 recently. It's supposed to be able to edit an image from a prompt while staying consistent. I tried the lowyat picture and asked it to remove the explosion. It did well
user posted image

Using Nano Banana
user posted image

Qwen's output is very close to Nano Banana's, but it didn't remove the flare in the middle of the two guys' shirts as cleanly as Nano Banana did

I also tried asking Qwen to create a realistic photo from the original design
user posted image

This post has been edited by xxboxx: Sep 28 2025, 11:14 PM
c2tony
post Nov 21 2025, 06:47 PM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Sep 28 2025, 11:13 PM)
Qwen released their image_edit_2509 recently. It's supposed to be able to edit an image from a prompt while staying consistent. I tried the lowyat picture and asked it to remove the explosion. It did well
*
run it using ollama? i can't find it

P/S: the Comet browser from Perplexity sounds promising; it could change the way browsing works

This post has been edited by c2tony: Nov 21 2025, 08:43 PM
TSxxboxx
post Nov 21 2025, 11:57 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Nov 21 2025, 06:47 PM)
run it using ollama? i can't find it

P/S: the Comet browser from Perplexity sounds promising; it could change the way browsing works
*
This is using ComfyUI for image generation.

deepseek-ocr was released yesterday; it's supposed to be able to look at an image and run OCR, but when I ran it, it kept repeating "x=0" until I closed the app. Already updated to version 0.13.0, so I don't know what the issue is. Maybe it still has some bugs
c2tony
post Nov 22 2025, 07:14 AM

Getting Started
**
Junior Member
80 posts

Joined: Jan 2003
From: BM, Butterworth, Penang island.


QUOTE(xxboxx @ Nov 21 2025, 11:57 PM)
This is using ComfyUI for image generation.

deepseek-ocr was released yesterday; it's supposed to be able to look at an image and run OCR, but when I ran it, it kept repeating "x=0" until I closed the app. Already updated to version 0.13.0, so I don't know what the issue is. Maybe it still has some bugs
*
Yeah, mine was having errors as well, and i closed it and went to sleep lmao

somehow i think lm studio is getting better and ollama is getting worse with their new direction, the app-like thing and the cloud thingy
TSxxboxx
post Nov 22 2025, 11:03 PM

The mind is for having ideas, not holding them
*******
Senior Member
5,256 posts

Joined: Oct 2004
From: J@Y B33


QUOTE(c2tony @ Nov 22 2025, 07:14 AM)
Yeah, mine was having errors as well, and i closed it and went to sleep lmao

somehow i think lm studio is getting better and ollama is getting worse with their new direction, the app-like thing and the cloud thingy
*
Yeah, at first it seemed better with its app, using less resources and running faster, but now it seems worse.
Also very annoying that it adds cloud options to the model choices.

 
