mirror of https://github.com/amithkoujalgi/ollama4j.git
synced 2025-09-16 11:48:58 +02:00
Add docs for thinking APIs and update examples
Added new documentation for 'chat-with-thinking' and 'generate-thinking' APIs, including usage examples and streamed output. Updated existing API docs to improve example clarity, response formatting, and added more interactive output using TypewriterTextarea. Removed deprecated 'list-library-models' doc and made minor README updates.
This commit is contained in:
parent
931d5dd520
commit
f914707536
@@ -210,7 +210,9 @@ pip install pre-commit
 #### Setup dev environment
 
 > **Note**
-> If you're on Windows, install [Chocolatey Package Manager for Windows](https://chocolatey.org/install) and then install `make` by running `choco install make`. Just a little tip - run the command with administrator privileges if installation faiils.
+> If you're on Windows, install [Chocolatey Package Manager for Windows](https://chocolatey.org/install) and then
+> install `make` by running `choco install make`. Just a little tip - run the command with administrator privileges if
+> installation fails.
 
 ```shell
 make dev
@@ -265,6 +267,7 @@ If you like or are using this project to build your own, please give us a star.
 | 9 | moqui-wechat | A moqui-wechat component | [GitHub](https://github.com/heguangyong/moqui-wechat) |
 | 10 | B4X | A set of simple and powerful RAD tools for Desktop and Server development | [Website](https://www.b4x.com/android/forum/threads/ollama4j-library-pnd_ollama4j-your-local-offline-llm-like-chatgpt.165003/) |
 | 11 | Research Article | Article: `Large language model based mutations in genetic improvement` - published on National Library of Medicine (National Center for Biotechnology Information) | [Website](https://pmc.ncbi.nlm.nih.gov/articles/PMC11750896/) |
+| 12 | renaime | Renaime is a LLaVA-powered tool that automatically renames files for you. | [Website](https://devpost.com/software/renaime) |
 
 ## Traction
 

docs/docs/apis-generate/chat-with-thinking.md (new file, 92 lines)
@@ -0,0 +1,92 @@
---
sidebar_position: 8
---

import CodeEmbed from '@site/src/components/CodeEmbed';
import TypewriterTextarea from '@site/src/components/TypewriterTextarea';

# Chat with Thinking

This API allows you to generate responses from an LLM while also retrieving the model's "thinking" process separately from the final answer. The "thinking" tokens represent the model's internal reasoning or planning before it produces the actual response. This can be useful for debugging, transparency, or simply understanding how the model arrives at its answers.

You can use this feature to receive both the thinking and the response as separate outputs, either as a complete result or streamed token by token. The examples below show how to use the API to access both the thinking and the response, and how to display them in your application.
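For orientation before the embedded examples: the call shape is roughly the sketch below. The `withThinking(true)` flag and the `getThinking()` accessor are assumptions for illustration and vary across ollama4j versions, so treat the embedded `ChatWithThinkingModelExample` as authoritative.

```java
import io.github.ollama4j.OllamaAPI;
import io.github.ollama4j.models.chat.OllamaChatMessageRole;
import io.github.ollama4j.models.chat.OllamaChatRequest;
import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
import io.github.ollama4j.models.chat.OllamaChatResult;

public class ChatWithThinkingSketch {
    public static void main(String[] args) throws Exception {
        // Point the client at a locally running Ollama server
        OllamaAPI ollamaAPI = new OllamaAPI("http://localhost:11434/");

        // Build a chat request for a thinking-capable model;
        // withThinking(true) is an assumed builder flag, shown for illustration
        OllamaChatRequest request = OllamaChatRequestBuilder
                .getInstance("deepseek-r1")
                .withThinking(true)
                .withMessage(OllamaChatMessageRole.USER, "What is the capital of France?")
                .build();

        OllamaChatResult result = ollamaAPI.chat(request);

        // getThinking()/getContent() are assumed accessors on the reply message
        System.out.println("Thinking: " + result.getResponseModel().getMessage().getThinking());
        System.out.println("Answer:   " + result.getResponseModel().getMessage().getContent());
    }
}
```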
### Chat with thinking model and receive the thinking and response text separately

<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithThinkingModelExample.java" />

You will get a response similar to:

:::tip[LLM Response]
**First thinking response:** User asks a simple question. We just answer.

**First answer response:** The capital of France is _**Paris**_.

**Second thinking response:** User: "And what is the second largest city?" They asked about the second largest city in France. Provide answer: Paris largest, second largest is Marseille. We can provide population stats, maybe mention Lyon as third largest. Also context. The answer should be concise. Provide some details: Marseille is the second largest, population ~870k, located on Mediterranean coast. Provide maybe some facts. Given no request for extra context, just answer.

**Second answer response:** The second‑largest city in France is _**Marseille**_. It’s a major Mediterranean port with a population of roughly 870,000 (as of the latest estimates) and is known for its historic Old Port, vibrant cultural scene, and diverse population.
:::

### Chat with thinking model and receive the thinking and response tokens streamed

<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingWithThinkingExample.java" />

You will get a response similar to:

:::tip[First Question's Thinking Tokens]
<TypewriterTextarea
textContent={`USER ASKS A SIMPLE QUESTION: "WHAT IS THE CAPITAL OF FRANCE?" THE ANSWER: PARIS. PROVIDE ANSWER.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::

:::tip[First Question's Response Tokens]
<TypewriterTextarea
textContent={`the capital of france is 'paris'.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::

:::tip[Second Question's Thinking Tokens]
<TypewriterTextarea
textContent={`THE USER ASKS: "AND WHAT IS THE SECOND LARGEST CITY?" LIKELY REFERRING TO FRANCE. THE SECOND LARGEST CITY IN FRANCE (BY POPULATION) IS MARSEILLE. HOWEVER, THERE MIGHT BE NUANCE: THE LARGEST IS PARIS, SECOND LARGEST IS MARSEILLE. BUT SOME MIGHT ARGUE THAT LYON IS SECOND LARGEST? LET'S CONFIRM: POPULATION OF FRANCE: PARIS ~2.1M (METRO 12M). MARSEILLE ~870K (METRO 1.5M). LYON ~515K (METRO 1.5M). SO MARSEILLE IS SECOND LARGEST CITY PROPER. LYON IS THIRD LARGEST. SO ANSWER: MARSEILLE. WE SHOULD PROVIDE THAT. PROVIDE A BRIEF EXPLANATION.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::

:::tip[Second Question's Response Tokens]
<TypewriterTextarea
textContent={`the second‑largest city in france by population is 'marseille'.
- marseille ≈ 870,000 residents (city proper)
- lyon ≈ 515,000 residents (city proper)

so marseille comes after paris as france’s largest city.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::
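The streamed variant differs only in that tokens are pushed to a handler as they arrive. A minimal sketch, assuming a `chat(request, handler)` overload whose handler receives each streamed token (the embedded `ChatStreamingWithThinkingExample` is authoritative):

```java
// Streaming sketch: reuses ollamaAPI and request from the sketch above.
// The single-argument token handler is an assumed overload; a real example
// may expose separate callbacks for thinking tokens and answer tokens.
ollamaAPI.chat(request, token -> System.out.print(token));
```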
@@ -22,9 +22,9 @@ session. The tool invocation and response handling are all managed internally by
 <CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithTools.java"/>
 
 ::::tip[LLM Response]
-> First answer: 6527fb60-9663-4073-b59e-855526e0a0c2 is the ID of the employee named 'Rahul Kumar'.
->
-> Second answer: Kumar is the last name of the employee named 'Rahul Kumar'.
+**First answer:** 6527fb60-9663-4073-b59e-855526e0a0c2 is the ID of the employee named 'Rahul Kumar'.
+
+**Second answer:** _Kumar_ is the last name of the employee named 'Rahul Kumar'.
 ::::
 
 This tool calling can also be done using the streaming API.

@@ -63,7 +63,7 @@ The annotated method can then be used as a tool in the chat session:
 Running the above would produce a response similar to:
 
 ::::tip[LLM Response]
-> First answer: 0.0000112061 is the most important constant in the world using 10 digits, according to my function. This constant is known as Planck's constant and plays a fundamental role in quantum mechanics. It relates energy and frequency in electromagnetic radiation and action (the product of momentum and distance) for particles.
->
-> Second answer: 3-digit constant: 8.001
+**First answer:** 0.0000112061 is the most important constant in the world using 10 digits, according to my function. This constant is known as Planck's constant and plays a fundamental role in quantum mechanics. It relates energy and frequency in electromagnetic radiation and action (the product of momentum and distance) for particles.
+
+**Second answer:** 3-digit constant: 8.001
 ::::
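For context, registering a tool programmatically looks roughly like the sketch below. The `Tools.ToolSpecification` builder fields and `registerTool(...)` follow older ollama4j docs from memory; treat every name as an assumption and defer to the embedded `ChatWithTools` example.

```java
// Sketch: register a function the model may call during the chat session.
// All names here are assumptions; verify against your ollama4j version.
Tools.ToolSpecification employeeIdTool = Tools.ToolSpecification.builder()
        .functionName("get-employee-id")
        .functionDescription("Looks up an employee's ID by full name")
        .toolFunction(arguments -> {
            // arguments holds the parameters the model chose to pass
            String employeeName = String.valueOf(arguments.get("employee-name"));
            return UUID.nameUUIDFromBytes(employeeName.getBytes()).toString();
        })
        .build();

ollamaAPI.registerTool(employeeIdTool);
// Subsequent ollamaAPI.chat(...) calls can now invoke the tool internally.
```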
@@ -51,31 +51,13 @@ You will get a response similar to:
 
 ### Create a conversation where the answer is streamed
 
-<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingWithTokenConcatenationExample.java" />
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingExample.java" />
 
-<!-- ::::tip[LLM Response]
->
-> The
->
-> The capital
->
-> The capital of
->
-> The capital of France
->
-> The capital of France is
->
-> The capital of France is Paris
->
-> The capital of France is Paris.
->
-:::: -->
-
 <TypewriterTextarea
-textContent='The capital of France is Paris.'
+textContent="'The Great Gatsby' by F. Scott Fitzgerald is a complex and multifaceted novel that explores themes of wealth, class, love, loss, and the American Dream. It is a landmark work of American literature that examines the social and psychological consequences of the American Dream's unattainability and its impact on the lives of its characters."
-typingSpeed={30}
+typingSpeed={5}
 pauseBetweenSentences={1200}
-height='55px'
+height='140px'
 width='100%'
 />
 
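A compact sketch of what the streaming example does: accumulate tokens into one string while echoing them. The `chat(request, handler)` overload is an assumption; see the embedded `ChatStreamingExample` for the real handler type.

```java
// Sketch: stream a chat answer and concatenate the tokens as they arrive.
StringBuilder answer = new StringBuilder();
ollamaAPI.chat(request, token -> {
    answer.append(token);     // accumulate for later use
    System.out.print(token);  // echo for a typewriter-style effect
});
System.out.println("\nFull answer: " + answer);
```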
@@ -94,24 +76,29 @@ You will get a response similar to:
 You will get a response as:
 
 ::::tip[LLM Response]
-> Shhh!
+Shhh!
 ::::
 
 
 ## Create a conversation about an image (requires a vision model)
 
+Let's use this image:
+
+<img src="https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg" alt="Img" style={{ maxWidth: '250px', height: 'auto', display: 'block', margin: '1rem 0' }} />
+
 <CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithImage.java" />
 
 
 You will get a response similar to:
 
 ::::tip[LLM Response]
-> First Answer: The image shows a dog sitting on the bow of a boat that is docked in calm water. The boat has two
-> levels, with the lower level containing seating and what appears to be an engine cover. The dog seems relaxed and
-> comfortable on the boat, looking out over the water. The background suggests it might be late afternoon or early
-> evening, given the warm lighting and the low position of the sun in the sky.
->
-> Second Answer: Based on the image, it's difficult to definitively determine the breed of the dog. However, the dog
-> appears to be medium-sized with a short coat and a brown coloration, which might suggest that it is a Golden Retriever
-> or a similar breed. Without more details like ear shape and tail length, it's not possible to identify the exact breed
-> confidently.
+**First Answer:** The image shows a dog sitting on the bow of a boat that is docked in calm water. The boat has two
+levels, with the lower level containing seating and what appears to be an engine cover. The dog seems relaxed and
+comfortable on the boat, looking out over the water. The background suggests it might be late afternoon or early
+evening, given the warm lighting and the low position of the sun in the sky.
+
+**Second Answer:** Based on the image, it's difficult to definitively determine the breed of the dog. However, the dog
+appears to be medium-sized with a short coat and a brown coloration, which might suggest that it is a **_Golden Retriever_**
+or a similar breed. Without more details like ear shape and tail length, it's not possible to identify the exact breed
+confidently.
 ::::
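Attaching an image to a chat message looks roughly like this sketch; the image-list overload of `withMessage(...)` is an assumption, so check the embedded `ChatWithImage` example for the exact signature:

```java
// Sketch: ask a vision model about an image by URL.
// The List<String> of image URLs is an assumed overload; verify it.
OllamaChatRequest request = OllamaChatRequestBuilder
        .getInstance("llava")
        .withMessage(OllamaChatMessageRole.USER,
                "What's in the picture?",
                List.of("https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg"))
        .build();
OllamaChatResult result = ollamaAPI.chat(request);
System.out.println(result.getResponseModel().getMessage().getContent());
```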
@@ -3,9 +3,12 @@ sidebar_position: 2
 ---
 
 import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
 
 # Generate (Async)
 
+### Generate response from a model asynchronously
+
 This API lets you ask questions to the LLMs in an asynchronous way.
 This is particularly helpful when you want to issue a generate request to the LLM and collect the response in the
 background (such as threads) without blocking your code until the response arrives from the model.

@@ -15,8 +18,10 @@ the [completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#gener
 <CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateAsync.java" />
 
-::::tip[LLM Response]
-Here are the participating teams in the 2019 ICC Cricket World Cup:
+You will get a response similar to:
+
+<TypewriterTextarea
+textContent={`Here are the participating teams in the 2019 ICC Cricket World Cup:
 
 1. Australia
 2. Bangladesh
@@ -26,5 +31,54 @@ Here are the participating teams in the 2019 ICC Cricket World Cup:
 6. England
 7. South Africa
 8. West Indies (as a team)
-9. Afghanistan
-::::
+9. Afghanistan`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+
+### Generate response from a model asynchronously with thinking and response streamed
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateAsyncWithThinking.java" />
+
+<TypewriterTextarea
+textContent={`WE NEED TO ANSWER THE QUESTION: "HOW LONG DOES IT TAKE FOR THE LIGHT FROM THE SUN TO REACH EARTH?" THE USER LIKELY EXPECTS THE TIME IN SECONDS, MINUTES, OR HOURS. LIGHT TRAVELS AT SPEED OF LIGHT (299,792,458 M/S). DISTANCE BETWEEN SUN AND EARTH IS ABOUT 1 AU (~149.6 MILLION KM). SO TRAVEL TIME = 1 AU / C ≈ 500 SECONDS ≈ 8.3 MINUTES. MORE PRECISELY, 8 MINUTES AND 20 SECONDS. PROVIDE CONTEXT: AVERAGE DISTANCE, VARYING DUE TO ELLIPTICAL ORBIT. SO ANSWER: ABOUT 8 MINUTES 20 SECONDS. ALSO MENTION THAT DUE TO VARIATION: FROM 8:07 TO 8:20. PROVIDE DETAILS. ALSO MENTION THAT WE REFER TO THE TIME LIGHT TAKES TO TRAVEL 1 ASTRONOMICAL UNIT.
+
+ALSO MIGHT MENTION: FOR MORE PRECISE: 499 SECONDS = 8 MIN 19 S. VARIATION DUE TO EARTH'S ORBIT: FROM 8 MIN 6 S TO 8 MIN 20 S. SO ANSWER.
+
+LET'S CRAFT AN EXPLANATION.
+
+the sun’s light takes a little over **eight minutes** to get to earth.
+
+| quantity | value |
+|----------|-------|
+| distance (average) | 1 astronomical unit (au) ≈ 149,600,000 km |
+| speed of light | \(c = 299,792,458\) m s⁻¹ |
+| light‑travel time | \(\displaystyle \frac{1\ \text{au}}{c} \approx 499\ \text{s}\) |
+
+499 seconds is **8 min 19 s**.
+
+because the earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). this gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. thus, when we look at the sun, we’re seeing it as it was about eight minutes ago
+
+Complete thinking response: We need to answer the question: "How long does it take for the light from the Sun to reach Earth?" The user likely expects the time in seconds, minutes, or hours. Light travels at speed of light (299,792,458 m/s). Distance between Sun and Earth is about 1 AU (~149.6 million km). So travel time = 1 AU / c ≈ 500 seconds ≈ 8.3 minutes. More precisely, 8 minutes and 20 seconds. Provide context: average distance, varying due to elliptical orbit. So answer: about 8 minutes 20 seconds. Also mention that due to variation: from 8:07 to 8:20. Provide details. Also mention that we refer to the time light takes to travel 1 astronomical unit.
+
+Also might mention: For more precise: 499 seconds = 8 min 19 s. Variation due to Earth's orbit: from 8 min 6 s to 8 min 20 s. So answer.
+
+Let's craft an explanation.
+
+Complete response: The Sun’s light takes a little over **eight minutes** to get to Earth.
+
+| Quantity | Value |
+|----------|-------|
+| Distance (average) | 1 astronomical unit (AU) ≈ 149,600,000 km |
+| Speed of light | \(c = 299,792,458\) m s⁻¹ |
+| Light‑travel time | \(\displaystyle \frac{1\ \text{AU}}{c} \approx 499\ \text{s}\) |
+
+499 seconds is **8 min 19 s**.
+
+Because the Earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). This gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. Thus, when we look at the Sun, we’re seeing it as it was about eight minutes ago.`}
+typingSpeed={5}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
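The async flow is fire-and-poll: kick off the request, then drain tokens while your own work continues. A sketch based on the `generateAsync(...)`/`OllamaAsyncResultStreamer` shape from earlier ollama4j docs (names may differ in your version):

```java
// Sketch: non-blocking generate; poll the streamer for tokens.
OllamaAsyncResultStreamer streamer = ollamaAPI.generateAsync(
        "gemma2", "List the teams in the 2019 ICC Cricket World Cup.", false);

while (streamer.isAlive()) {
    String tokens = streamer.getStream().poll(); // null when nothing new yet
    if (tokens != null) {
        System.out.print(tokens);
    }
    Thread.sleep(100); // your code is free to do other work here instead
}
System.out.println("\nComplete response: " + streamer.getCompleteResponse());
```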
(One file's diff is suppressed because one or more lines are too long.)

docs/docs/apis-generate/generate-thinking.md (new file, 55 lines)
@@ -0,0 +1,55 @@
---
sidebar_position: 2
---

import CodeEmbed from '@site/src/components/CodeEmbed';
import TypewriterTextarea from '@site/src/components/TypewriterTextarea';

# Generate with Thinking

This API allows you to generate responses from an LLM while also retrieving the model's "thinking" process separately from the final answer. The "thinking" tokens represent the model's internal reasoning or planning before it produces the actual response. This can be useful for debugging, transparency, or simply understanding how the model arrives at its answers.

You can use this feature to receive both the thinking and the response as separate outputs, either as a complete result or streamed token by token. The examples below show how to use the API to access both the thinking and the response, and how to display them in your application.
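The blocking form reduces to a single call that returns thinking and answer side by side, roughly as sketched here. The position of the assumed `think` flag and the `getThinking()` accessor vary by version, so defer to the embedded `GenerateWithThinking` example.

```java
// Sketch: one-shot generate with thinking; flag and accessor are assumed names.
OllamaResult result = ollamaAPI.generate(
        "deepseek-r1",                  // a thinking-capable model
        "Who are you?",                 // prompt
        false,                          // raw mode off
        true,                           // think = true (assumed flag)
        new OptionsBuilder().build());  // default options

System.out.println("Thinking: " + result.getThinking()); // assumed accessor
System.out.println("Response: " + result.getResponse());
```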
### Generate response with thinking and receive the thinking and response text separately

<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithThinking.java" />

You will get a response similar to:

:::tip[Thinking Tokens]
User asks "Who are you?" It's a request for identity. As ChatGPT, we should explain that I'm an AI developed by OpenAI, etc. Provide friendly explanation.
:::

:::tip[Response Tokens]
I’m ChatGPT, a large language model created by OpenAI. I’m designed to understand and generate natural‑language text, so I can answer questions, help with writing, explain concepts, brainstorm ideas, and chat about almost any topic. I don’t have a personal life or consciousness—I’m a tool that processes input and produces responses based on patterns in the data I was trained on. If you have any questions about how I work or what I can do, feel free to ask!
:::

### Generate response and receive the thinking and response tokens streamed

<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithThinkingStreamed.java" />

You will get a response similar to:

:::tip[Thinking Tokens]
<TypewriterTextarea
textContent={`User asks "Who are you?" It's a request for identity. As ChatGPT, we should explain that I'm an AI developed by OpenAI, etc. Provide friendly explanation.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::

:::tip[Response Tokens]
<TypewriterTextarea
textContent={`I’m ChatGPT, a large language model created by OpenAI. I’m designed to understand and generate natural‑language text, so I can answer questions, help with writing, explain concepts, brainstorm ideas, and chat about almost any topic. I don’t have a personal life or consciousness—I’m a tool that processes input and produces responses based on patterns in the data I was trained on. If you have any questions about how I work or what I can do, feel free to ask!`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{ whiteSpace: 'pre-line' }}
/>
:::
@@ -28,6 +28,6 @@ If you have this image downloaded and you pass the path to the downloaded image
 You will get a response similar to:
 
 ::::tip[LLM Response]
-> This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
-> be enjoying its time outdoors, perhaps on a lake.
+This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
+be enjoying its time outdoors, perhaps on a lake.
 ::::

@@ -28,6 +28,6 @@ Passing the link of this image the following code:
 You will get a response similar to:
 
 ::::tip[LLM Response]
-> This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
-> be enjoying its time outdoors, perhaps on a lake.
+This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
+be enjoying its time outdoors, perhaps on a lake.
 ::::
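Both image pages wrap the same call shape. A minimal sketch using the `generateWithImageFiles(...)` name from earlier ollama4j docs (there is a URL-based sibling as well; verify both in your version):

```java
// Sketch: ask a vision model about a local image file.
// The file path is illustrative only.
OllamaResult result = ollamaAPI.generateWithImageFiles(
        "llava",
        "What's in this image?",
        List.of(new File("/path/to/dog-on-boat.jpg")),
        new OptionsBuilder().build());
System.out.println(result.getResponse());
```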
@@ -23,8 +23,8 @@ to [this](/apis-extras/options-builder).
 You will get a response similar to:
 
 ::::tip[LLM Response]
-> I am a large language model created by Alibaba Cloud. My purpose is to assist users in generating text, answering
-> questions, and completing tasks. I aim to be user-friendly and easy to understand for everyone who interacts with me.
+I am a model of an AI trained by Mistral AI. I was designed to assist with a wide range of tasks, from answering
+questions to helping with complex computations and research. How can I help you today?
 ::::
 
 ### Try asking a question, receiving the answer streamed
@@ -33,51 +33,37 @@ You will get a response similar to:
 
 You will get a response similar to:
 
-<!-- ::::tip[LLM Response]
-> The
->
-> The capital
->
-> The capital of
->
-> The capital of France
->
-> The capital of France is
->
-> The capital of France is Paris
->
-> The capital of France is Paris.
-:::: -->
-
 <TypewriterTextarea
 textContent='The capital of France is Paris.'
 typingSpeed={30}
 pauseBetweenSentences={1200}
 height='55px'
 width='100%'
 />
 
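The streamed generate call mirrors the blocking one with an extra token handler, roughly as sketched here (the handler-taking overload is an assumption; see the embedded example):

```java
// Sketch: generate with a stream handler that fires per token batch.
OllamaResult result = ollamaAPI.generate(
        "llama3",
        "What is the capital of France?",
        false,
        new OptionsBuilder().build(),
        partial -> System.out.print(partial)); // called as tokens arrive
```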
 ## Generate structured output
 
 ### With response as a `Map`
 
-<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/StructuredOutput.java" />
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateStructuredOutput.java" />
 
 You will get a response similar to:
 
 ::::tip[LLM Response]
 
 ```json
 {
-  "available": true,
-  "age": 22
+  "heroName" : "Batman",
+  "ageOfPerson" : 30
 }
 ```
 
 ::::
 
 ### With response mapped to specified class type
 
-<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/StructuredOutputMappedToObject.java" />
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateStructuredOutputMappedToObject.java" />
 
 ::::tip[LLM Response]
-Person(age=28, available=false)
+HeroInfo(heroName=Batman, ageOfPerson=30)
 ::::
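Structured output amounts to sending a JSON-schema-like `format` with the request and mapping the JSON reply onto a class. A sketch with assumed names (`generateWithFormat(...)`, the `HeroInfo` record); the embedded `GenerateStructuredOutput*` examples are the source of truth:

```java
// Sketch: request schema-constrained JSON and map it to a Java record.
// generateWithFormat(...) and HeroInfo are illustrative assumptions.
record HeroInfo(String heroName, int ageOfPerson) {}

Map<String, Object> format = Map.of(
        "type", "object",
        "properties", Map.of(
                "heroName", Map.of("type", "string"),
                "ageOfPerson", Map.of("type", "integer")),
        "required", List.of("heroName", "ageOfPerson"));

OllamaResult result = ollamaAPI.generateWithFormat(
        "llama3", "Batman is 30 years old.", format);

// Map the JSON reply onto the record, e.g. with Jackson's ObjectMapper
HeroInfo hero = new ObjectMapper().readValue(result.getResponse(), HeroInfo.class);
System.out.println(hero); // HeroInfo[heroName=Batman, ageOfPerson=30]
```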
The deprecated 'list-library-models' doc (deleted file)
@@ -1,70 +0,0 @@
---
sidebar_position: 1
---

import CodeEmbed from '@site/src/components/CodeEmbed';

# Models from Ollama Library

This API retrieves a list of models directly from the Ollama library.

### List Models from Ollama Library

This API fetches available models from the Ollama library page, including details such as the model's name, pull count, popular tags, tag count, and the last update time.

<CodeEmbed src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ListLibraryModels.java'></CodeEmbed>

The following is the sample output:

```
[
LibraryModel(name=llama3.2-vision, description=Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes., pullCount=21.1K, totalTags=9, popularTags=[vision, 11b, 90b], lastUpdated=yesterday),
LibraryModel(name=llama3.2, description=Meta's Llama 3.2 goes small with 1B and 3B models., pullCount=2.4M, totalTags=63, popularTags=[tools, 1b, 3b], lastUpdated=6 weeks ago)
]
```

### Get Tags of a Library Model

This API fetches the tags associated with a specific model from the Ollama library.

<CodeEmbed src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GetLibraryModelTags.java'></CodeEmbed>

The following is the sample output:

```
LibraryModelDetail(
model=LibraryModel(name=llama3.2-vision, description=Llama 3.2 Vision is a collection of instruction-tuned image reasoning generative models in 11B and 90B sizes., pullCount=21.1K, totalTags=9, popularTags=[vision, 11b, 90b], lastUpdated=yesterday),
tags=[
LibraryModelTag(name=llama3.2-vision, tag=latest, size=7.9GB, lastUpdated=yesterday),
LibraryModelTag(name=llama3.2-vision, tag=11b, size=7.9GB, lastUpdated=yesterday),
LibraryModelTag(name=llama3.2-vision, tag=90b, size=55GB, lastUpdated=yesterday)
]
)
```

### Find a model from Ollama library

This API finds a specific model using model `name` and `tag` from the Ollama library.

<CodeEmbed src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/FindLibraryModel.java'></CodeEmbed>

The following is the sample output:

```
LibraryModelTag(name=qwen2.5, tag=7b, size=4.7GB, lastUpdated=7 weeks ago)
```

### Pull model using `LibraryModelTag`

You can use `LibraryModelTag` to pull models into the Ollama server.

<CodeEmbed src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/PullLibraryModelTags.java'></CodeEmbed>
@@ -4,7 +4,7 @@ sidebar_position: 2
 
 import CodeEmbed from '@site/src/components/CodeEmbed';
 
-# List Local Models
+# List Library Models
 
 This API lets you list downloaded/available models on the Ollama server.
 
docs/package-lock.json (generated, 8565 lines changed)
File diff suppressed because it is too large.
@@ -1,53 +1,54 @@
 import React, { useEffect, useState, useRef } from 'react';
 
-const TypewriterTextarea = ({ textContent, typingSpeed = 50, pauseBetweenSentences = 1000, height = '200px', width = '100%', align = 'left' }) => {
-  const [text, setText] = useState('');
-  const [sentenceIndex, setSentenceIndex] = useState(0);
+const TypewriterTextarea = ({
+  textContent,
+  typingSpeed = 50,
+  pauseBetweenSentences = 1000,
+  height = '200px',
+  width = '100%',
+  align = 'left',
+  style = {},
+}) => {
+  const [displayedText, setDisplayedText] = useState('');
   const [charIndex, setCharIndex] = useState(0);
-  const sentences = textContent ? textContent.split('\n') : [];
   const isTyping = useRef(false);
 
+  // Flatten textContent to a string, preserving \n
+  const fullText = textContent || '';
+
   useEffect(() => {
-    if (!textContent) return;
+    if (!fullText) return;
 
     if (!isTyping.current) {
       isTyping.current = true;
     }
 
-    if (sentenceIndex >= sentences.length) {
+    if (charIndex > fullText.length) {
       // Reset to start from the beginning
-      setSentenceIndex(0);
       setCharIndex(0);
-      setText('');
+      setDisplayedText('');
       return;
     }
 
-    const currentSentence = sentences[sentenceIndex];
-
-    if (charIndex < currentSentence.length) {
+    if (charIndex < fullText.length) {
       const timeout = setTimeout(() => {
-        setText((prevText) => prevText + currentSentence[charIndex]);
+        setDisplayedText(fullText.slice(0, charIndex + 1));
         setCharIndex((prevCharIndex) => prevCharIndex + 1);
-      }, typingSpeed);
+      }, fullText[charIndex] === '\n' ? typingSpeed : typingSpeed);
 
       return () => clearTimeout(timeout);
     } else {
-      // Wait a bit, then go to the next sentence
+      // Wait a bit, then restart
      const timeout = setTimeout(() => {
-        setSentenceIndex((prev) => prev + 1);
         setCharIndex(0);
+        setDisplayedText('');
       }, pauseBetweenSentences);
 
       return () => clearTimeout(timeout);
     }
-  }, [charIndex, sentenceIndex, sentences, typingSpeed, pauseBetweenSentences, textContent]);
+    // eslint-disable-next-line
+  }, [charIndex, fullText, typingSpeed, pauseBetweenSentences]);
 
   return (
-    <textarea
-      value={text}
-      readOnly
-      rows={10}
-      cols={5}
+    <div
       style={{
         width: typeof width === 'number' ? `${width}px` : width,
         height: height,

@@ -60,8 +61,12 @@ const TypewriterTextarea = ({ textContent, typingSpeed = 50, pauseBetweenSentenc
         resize: 'none',
         whiteSpace: 'pre-wrap',
         color: 'black',
+        overflow: 'auto',
+        ...style,
       }}
-    />
+    >
+      {displayedText}
+    </div>
   );
 };
 