Update Javadoc in Agent class to reflect changes in the interact method signature, now including an OllamaChatStreamObserver parameter for improved conversation handling.

Enhance agent documentation with detailed YAML configuration instructions and benefits. Update CodeEmbed component to support customizable language for syntax highlighting. Refactor Agent class to improve Javadoc comments and method signatures for better clarity and functionality.
Update Javadoc comments in Ollama and Agent classes to reflect correct method references for chat request construction and agent instantiation.
2025-10-26 14:10:43 +01:00 · 2025-10-20 22:31:04 +05:30 · 2025-10-20 20:07:15 +05:30 · 2025-10-19 14:34:21 +05:30 · 2025-10-19 14:03:10 +05:30 · 2025-10-19 11:22:16 +05:30
190 changed files with 30418 additions and 19876 deletions
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -0,0 +1,10 @@
+# See https://docs.github.com/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners
+
+# Default owners for everything in the repo
+* @amithkoujalgi
+
+# Example for scoping ownership (uncomment and adjust as teams evolve)
+# /docs/ @amithkoujalgi
+# /src/ @amithkoujalgi
+
+
--- a/.github/ISSUE_TEMPLATE/bug_report.yml
+++ b/.github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,59 @@
+name: Bug report
+description: File a bug report
+labels: [bug]
+assignees: []
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for taking the time to fill out this bug report!
+  - type: input
+    id: version
+    attributes:
+      label: ollama4j version
+      description: e.g., 1.1.0
+      placeholder: 1.1.0
+    validations:
+      required: true
+  - type: input
+    id: java
+    attributes:
+      label: Java version
+      description: Output of `java -version`
+      placeholder: 11/17/21
+    validations:
+      required: true
+  - type: input
+    id: environment
+    attributes:
+      label: Environment
+      description: OS, build tool, Docker/Testcontainers, etc.
+      placeholder: macOS 13, Maven 3.9.x, Docker 24.x
+  - type: textarea
+    id: what-happened
+    attributes:
+      label: What happened?
+      description: Also tell us what you expected to happen
+    validations:
+      required: true
+  - type: textarea
+    id: steps
+    attributes:
+      label: Steps to reproduce
+      description: Be as specific as possible
+      placeholder: |
+        1. Setup ...
+        2. Run ...
+        3. Observe ...
+    validations:
+      required: true
+  - type: textarea
+    id: logs
+    attributes:
+      label: Relevant logs/stack traces
+      render: shell
+  - type: textarea
+    id: additional
+    attributes:
+      label: Additional context
+
--- a/.github/ISSUE_TEMPLATE/config.yml
+++ b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1,6 @@
+blank_issues_enabled: false
+contact_links:
+  - name: Questions / Discussions
+    url: https://github.com/ollama4j/ollama4j/discussions
+    about: Ask questions and discuss ideas here
+
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,31 @@
+name: Feature request
+description: Suggest an idea or enhancement
+labels: [enhancement]
+assignees: []
+body:
+  - type: markdown
+    attributes:
+      value: |
+        Thanks for suggesting an improvement!
+  - type: textarea
+    id: problem
+    attributes:
+      label: Is your feature request related to a problem?
+      description: A clear and concise description of the problem
+      placeholder: I'm frustrated when...
+  - type: textarea
+    id: solution
+    attributes:
+      label: Describe the solution you'd like
+      placeholder: I'd like...
+    validations:
+      required: true
+  - type: textarea
+    id: alternatives
+    attributes:
+      label: Describe alternatives you've considered
+  - type: textarea
+    id: context
+    attributes:
+      label: Additional context
+
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,34 @@
+## Description
+
+Describe what this PR does and why.
+
+## Type of change
+
+- [ ] feat: New feature
+- [ ] fix: Bug fix
+- [ ] docs: Documentation update
+- [ ] refactor: Refactoring
+- [ ] test: Tests only
+- [ ] build/ci: Build or CI changes
+
+## How has this been tested?
+
+Explain the testing done. Include commands, screenshots, logs.
+
+## Checklist
+
+- [ ] I ran `pre-commit run -a` locally
+- [ ] `make build` succeeds locally
+- [ ] Unit/integration tests added or updated as needed
+- [ ] Docs updated (README/docs site) if user-facing changes
+- [ ] PR title follows Conventional Commits
+
+## Breaking changes
+
+List any breaking changes and migration notes.
+
+## Related issues
+
+Fixes #
+
+
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -0,0 +1,34 @@
+# To get started with Dependabot version updates, you'll need to specify which
+## package ecosystems to update and where the package manifests are located.
+## Please see the documentation for all configuration options:
+## https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
+#
+#version: 2
+#updates:
+#  - package-ecosystem: "" # See documentation for possible values
+#    directory: "/" # Location of package manifests
+#    schedule:
+#      interval: "weekly"
+
+
+version: 2
+updates:
+  - package-ecosystem: "maven"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 5
+    labels: ["dependencies"]
+  - package-ecosystem: "github-actions"
+    directory: "/"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 5
+    labels: ["dependencies"]
+  - package-ecosystem: "npm"
+    directory: "/docs"
+    schedule:
+      interval: "weekly"
+    open-pull-requests-limit: 5
+    labels: ["dependencies"]
+#
--- a/.github/workflows/build-on-pr-create.yml
+++ b/.github/workflows/build-on-pr-create.yml
@@ -1,34 +0,0 @@
-# This workflow will build a package using Maven and then publish it to GitHub packages when a release is created
-# For more information see: https://github.com/actions/setup-java/blob/main/docs/advanced-usage.md#apache-maven-with-a-settings-path
-
-name: Build on PR Create
-
-on:
-  pull_request:
-    types: [ opened, reopened ]
-    branches: [ "main" ]
-
-
-jobs:
-  build:
-
-    runs-on: ubuntu-latest
-    permissions:
-      contents: read
-      packages: write
-
-    steps:
-      - uses: actions/checkout@v3
-      - name: Set up JDK 11
-        uses: actions/setup-java@v3
-        with:
-          java-version: '11'
-          distribution: 'adopt-hotspot'
-          server-id: github # Value of the distributionManagement/repository/id field of the pom.xml
-          settings-path: ${{ github.workspace }} # location for the settings.xml file
-
-      - name: Build with Maven
-        run: mvn --file pom.xml -U clean package
-
-      - name: Run Tests
-        run: mvn --file pom.xml -U clean test -Punit-tests
--- a/.github/workflows/build-on-pull-request.yml
+++ b/.github/workflows/build-on-pull-request.yml
@@ -0,0 +1,59 @@
+name: Build and Test on Pull Request
+
+on:
+  pull_request:
+    types: [opened, reopened, synchronize]
+    branches:
+      - main
+    paths:
+      - 'src/**'
+      - 'pom.xml'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  build:
+    name: Build Java Project
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+
+    environment:
+      name: github-pages
+      url: ${{ steps.deployment.outputs.page_url }}
+
+    steps:
+      - uses: actions/checkout@v5
+      - name: Set up JDK 21
+        uses: actions/setup-java@v5
+        with:
+          java-version: '21'
+          distribution: 'oracle'
+          server-id: github
+          settings-path: ${{ github.workspace }}
+
+      - name: Build with Maven
+        run: mvn --file pom.xml -U clean package
+
+  run-tests:
+    name: Run Unit and Integration Tests
+    needs: build
+    uses: ./.github/workflows/run-tests.yml
+    with:
+      branch: ${{ github.head_ref || github.ref_name }}
+
+  build-docs:
+    name: Build Documentation
+    needs: [build, run-tests]
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v5
+      - name: Use Node.js
+        uses: actions/setup-node@v5
+        with:
+          node-version: '20.x'
+      - run: cd docs && npm ci
+      - run: cd docs && npm run build
--- a/.github/workflows/codeql.yml
+++ b/.github/workflows/codeql.yml
@@ -0,0 +1,44 @@
+name: CodeQL
+
+on:
+  push:
+    branches: [ main ]
+  pull_request:
+    branches: [ main ]
+  schedule:
+    - cron: '0 3 * * 1'
+
+jobs:
+  analyze:
+    name: Analyze
+    runs-on: ubuntu-latest
+    permissions:
+      actions: read
+      contents: read
+      security-events: write
+    strategy:
+      fail-fast: false
+      matrix:
+        language: [ 'java', 'javascript' ]
+    steps:
+      - name: Checkout repository
+        uses: actions/checkout@v5
+
+      - name: Set up JDK
+        if: matrix.language == 'java'
+        uses: actions/setup-java@v5
+        with:
+          distribution: oracle
+          java-version: '21'
+
+      - name: Initialize CodeQL
+        uses: github/codeql-action/init@v3
+        with:
+          languages: ${{ matrix.language }}
+
+      - name: Autobuild
+        uses: github/codeql-action/autobuild@v3
+
+      - name: Perform CodeQL Analysis
+        uses: github/codeql-action/analyze@v3
+
--- a/.github/workflows/gh-mvn-publish.yml
+++ b/.github/workflows/gh-mvn-publish.yml
@@ -13,12 +13,12 @@ jobs:
      packages: write

    steps:
-      - uses: actions/checkout@v3
-      - name: Set up JDK 17
-        uses: actions/setup-java@v3
+      - uses: actions/checkout@v5
+      - name: Set up JDK 21
+        uses: actions/setup-java@v5
        with:
-          java-version: '17'
-          distribution: 'temurin'
+          java-version: '21'
+          distribution: 'oracle'
          server-id: github
          settings-path: ${{ github.workspace }}

--- a/.github/workflows/label-issue-stale.yml
+++ b/.github/workflows/label-issue-stale.yml
@@ -0,0 +1,24 @@
+name: Mark stale issues
+
+on:
+  workflow_dispatch: # for manual run
+  schedule:
+    - cron: '0 0 * * *'  # Runs every day at midnight
+
+permissions:
+  contents: write # only for delete-branch option
+  issues: write
+
+jobs:
+  stale:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Mark stale issues
+        uses: actions/stale@v10
+        with:
+          repo-token: ${{ github.token }}
+          days-before-stale: 15
+          stale-issue-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.'
+          days-before-close: 7
+          stale-issue-label: 'stale'
+          exempt-issue-labels: 'pinned,security'
--- a/.github/workflows/maven-publish.yml
+++ b/.github/workflows/maven-publish.yml
@@ -24,13 +24,13 @@ jobs:
      packages: write

    steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v5

-      - name: Set up JDK 17
-        uses: actions/setup-java@v3
+      - name: Set up JDK 21
+        uses: actions/setup-java@v5
        with:
-          java-version: '17'
-          distribution: 'temurin'
+          java-version: '21'
+          distribution: 'oracle'
          server-id: github # Value of the distributionManagement/repository/id field of the pom.xml
          settings-path: ${{ github.workspace }} # location for the settings.xml file

--- a/.github/workflows/pre-commit.yml
+++ b/.github/workflows/pre-commit.yml
@@ -0,0 +1,30 @@
+name: Pre-commit Check on PR
+
+on:
+  pull_request:
+    types: [opened, reopened, synchronize]
+    branches:
+      - main
+
+#on:
+#  pull_request:
+#    branches: [ main ]
+#  push:
+#    branches: [ main ]
+
+jobs:
+  run:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+      - uses: actions/setup-python@v6
+        with:
+          python-version: '3.x'
+      - name: Install pre-commit
+        run: |
+          python -m pip install --upgrade pip
+          pip install pre-commit
+      # - name: Run pre-commit
+      #   run: |
+      #     pre-commit run --all-files --show-diff-on-failure
+
--- a/.github/workflows/publish-docs.yml
+++ b/.github/workflows/publish-docs.yml
@@ -1,5 +1,5 @@
 # Simple workflow for deploying static content to GitHub Pages
-name: Deploy Docs to GH Pages
+name: Publish Docs to GH Pages

 on:
  release:
@@ -29,18 +29,18 @@ jobs:
      name: github-pages
      url: ${{ steps.deployment.outputs.page_url }}
    steps:
-      - uses: actions/checkout@v3
-      - name: Set up JDK 11
-        uses: actions/setup-java@v3
+      - uses: actions/checkout@v5
+      - name: Set up JDK 21
+        uses: actions/setup-java@v5
        with:
-          java-version: '11'
-          distribution: 'adopt-hotspot'
+          java-version: '21'
+          distribution: 'oracle'
          server-id: github # Value of the distributionManagement/repository/id field of the pom.xml
          settings-path: ${{ github.workspace }} # location for the settings.xml file

-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v5
      - name: Use Node.js
-        uses: actions/setup-node@v3
+        uses: actions/setup-node@v5
        with:
          node-version: '20.x'
      - run: cd docs && npm ci
@@ -57,18 +57,18 @@ jobs:
        run: mvn --file pom.xml -U clean package && cp -r ./target/apidocs/. ./docs/build/apidocs

      - name: Doxygen Action
-        uses: mattnotmitt/doxygen-action@v1.1.0
+        uses: mattnotmitt/doxygen-action@v1.12.0
        with:
          doxyfile-path: "./Doxyfile"
          working-directory: "."

      - name: Setup Pages
-        uses: actions/configure-pages@v3
+        uses: actions/configure-pages@v5
      - name: Upload artifact
-        uses: actions/upload-pages-artifact@v2
+        uses: actions/upload-pages-artifact@v4
        with:
          # Upload entire repository
          path: './docs/build/.'
      - name: Deploy to GitHub Pages
        id: deployment
-        uses: actions/deploy-pages@v2
+        uses: actions/deploy-pages@v4
--- a/.github/workflows/run-tests.yml
+++ b/.github/workflows/run-tests.yml
@@ -0,0 +1,54 @@
+name: Run Tests
+
+on:
+  #  push:
+  #    branches:
+  #      - main
+
+  workflow_call:
+    inputs:
+      branch:
+        description: 'Branch name to run the tests on'
+        required: true
+        default: 'main'
+        type: string
+
+  workflow_dispatch:
+    inputs:
+      branch:
+        description: 'Branch name to run the tests on'
+        required: true
+        default: 'main'
+        type: string
+
+jobs:
+  run-tests:
+    name: Unit and Integration Tests
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout target branch
+        uses: actions/checkout@v5
+        with:
+          ref: ${{ inputs.branch }}
+
+      - name: Set up Ollama
+        run: |
+          curl -fsSL https://ollama.com/install.sh | sh
+
+      - name: Set up JDK 21
+        uses: actions/setup-java@v5
+        with:
+          java-version: '21'
+          distribution: 'oracle'
+          server-id: github
+          settings-path: ${{ github.workspace }}
+
+      - name: Run unit tests
+        run: make unit-tests
+
+      - name: Run integration tests
+        run: make integration-tests-basic
+        env:
+          USE_EXTERNAL_OLLAMA_HOST: "true"
+          OLLAMA_HOST: "http://localhost:11434"
--- a/.github/workflows/stale.yml
+++ b/.github/workflows/stale.yml
@@ -0,0 +1,33 @@
+name: Mark stale issues and PRs
+
+on:
+  schedule:
+    - cron: '0 2 * * *'
+
+permissions:
+  issues: write
+  pull-requests: write
+
+jobs:
+  stale:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/stale@v10
+        with:
+          days-before-stale: 60
+          days-before-close: 14
+          stale-issue-label: 'stale'
+          stale-pr-label: 'stale'
+          exempt-issue-labels: 'pinned,security'
+          exempt-pr-labels: 'pinned,security'
+          stale-issue-message: >
+            This issue has been automatically marked as stale because it has not had
+            recent activity. It will be closed if no further activity occurs.
+          close-issue-message: >
+            Closing this stale issue. Feel free to reopen if this is still relevant.
+          stale-pr-message: >
+            This pull request has been automatically marked as stale due to inactivity.
+            It will be closed if no further activity occurs.
+          close-pr-message: >
+            Closing this stale pull request. Please reopen when you're ready to continue.
+
--- a/.gitignore
+++ b/.gitignore
@@ -41,4 +41,4 @@ pom.xml.*
 release.properties
 !.idea/icon.svg

-src/main/java/io/github/ollama4j/localtests
+src/main/java/io/github/ollama4j/localtests
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -0,0 +1,46 @@
+repos:
+
+  # pre-commit hooks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
+    rev: "v6.0.0"
+    hooks:
+      - id: no-commit-to-branch
+        args: ['--branch', 'main']
+      - id: check-merge-conflict
+      - id: check-added-large-files
+      - id: check-yaml
+      - id: check-xml
+      - id: check-json
+      - id: pretty-format-json
+        args: ['--no-sort-keys', '--autofix', '--indent=4']
+      - id: end-of-file-fixer
+        exclude: \.json$
+        files: \.java$|\.xml$
+      - id: trailing-whitespace
+      - id: mixed-line-ending
+
+  # for commit message formatting
+  - repo: https://github.com/commitizen-tools/commitizen
+    rev: v4.9.1
+    hooks:
+      - id: commitizen
+        stages: [commit-msg]
+
+  - repo: local
+    hooks:
+      - id: format-code
+        name: Format Code
+        entry: make apply-formatting
+        language: system
+        always_run: true
+
+#  # for java code quality
+#  - repo: https://github.com/gherynos/pre-commit-java
+#    rev: v0.6.10
+#    hooks:
+#      - id: pmd
+#        exclude: /test/
+#      - id: cpd
+#        exclude: /test/
+#      - id: checkstyle
+#        exclude: /test/
--- a/CITATION.cff
+++ b/CITATION.cff
@@ -0,0 +1,9 @@
+cff-version: 1.2.0
+message: "If you use this software, please cite it as below."
+authors:
+  - family-names: "Koujalgi"
+    given-names: "Amith"
+title: "Ollama4j: A Java Library (Wrapper/Binding) for Ollama Server"
+version: "1.1.0"
+date-released: 2023-12-19
+url: "https://github.com/ollama4j/ollama4j"
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -0,0 +1,125 @@
+## Contributing to Ollama4j
+
+Thanks for your interest in contributing! This guide explains how to set up your environment, make changes, and submit pull requests.
+
+### Code of Conduct
+
+By participating, you agree to abide by our [Code of Conduct](CODE_OF_CONDUCT.md).
+
+### Quick Start
+
+Prerequisites:
+
+- Java 11+
+- Maven 3.8+
+- Docker (required for integration tests)
+- Make (for convenience targets)
+- pre-commit (for Git hooks)
+
+Setup:
+
+```bash
+# 1) Fork the repo and clone your fork
+git clone https://github.com/<your-username>/ollama4j.git
+cd ollama4j
+
+# 2) Install and enable git hooks
+pre-commit install --hook-type pre-commit --hook-type commit-msg
+
+# 3) Prepare dev environment (installs hook deps/tools if needed)
+make dev
+```
+
+Build and test:
+
+```bash
+# Build
+make build
+
+# Run unit tests
+make unit-tests
+
+# Run integration tests (requires Docker running)
+make integration-tests-all
+```
+
+If you prefer raw Maven:
+
+```bash
+# Unit tests profile
+mvn -P unit-tests clean test
+
+# Integration tests profile (Docker required)
+mvn -P integration-tests -DskipUnitTests=true clean verify
+```
+
+### Commit Style
+
+We use Conventional Commits. Commit messages and PR titles should follow:
+
+```
+<type>(optional scope): <short summary>
+
+[optional body]
+[optional footer(s)]
+```
+
+Common types: `feat`, `fix`, `docs`, `refactor`, `test`, `build`, `chore`.
+
+Commit message formatting is enforced via `commitizen` through `pre-commit` hooks.
+
+### Pre-commit Hooks
+
+Before pushing, run:
+
+```bash
+pre-commit run -a
+```
+
+Hooks will check for merge conflicts, large files, YAML/XML/JSON validity, line endings, and basic formatting. Fix reported issues before opening a PR.
+
+### Coding Guidelines
+
+- Target Java 11+; match existing style and formatting.
+- Prefer clear, descriptive names over abbreviations.
+- Add Javadoc for public APIs and non-obvious logic.
+- Include meaningful tests for new features and bug fixes.
+- Avoid introducing new dependencies without discussion.
+
+### Tests
+
+- Unit tests: place under `src/test/java/**/unittests/`.
+- Integration tests: place under `src/test/java/**/integrationtests/` (uses Testcontainers; ensure Docker is running).
+
+### Documentation
+
+- Update `README.md`, Javadoc, and `docs/` when you change public APIs or user-facing behavior.
+- Add example snippets where useful. Keep API references consistent with the website content when applicable.
+
+### Pull Requests
+
+Before opening a PR:
+
+- Ensure `make build` and all tests pass locally.
+- Run `pre-commit run -a` and fix any issues.
+- Keep PRs focused and reasonably small. Link related issues (e.g., "Closes #123").
+- Describe the change, rationale, and any trade-offs in the PR description.
+
+Review process:
+
+- Maintainers will review for correctness, scope, tests, and docs.
+- You may be asked to iterate; please be responsive to comments.
+
+### Security
+
+If you discover a security issue, please do not open a public issue. Instead, email the maintainer at `koujalgi.amith@gmail.com` with details.
+
+### License
+
+By contributing, you agree that your contributions will be licensed under the project’s [MIT License](LICENSE).
+
+### Questions and Discussion
+
+Have questions or ideas? Open a GitHub Discussion or issue. We welcome feedback and proposals!
+
+
--- a/LICENSE
+++ b/LICENSE
@@ -1,6 +1,6 @@
 MIT License

-Copyright (c) 2023 Amith Koujalgi
+Copyright (c) 2023 Amith Koujalgi and contributors

 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
--- a/Makefile
+++ b/Makefile
@@ -1,28 +1,79 @@
-build:
-	mvn -B clean install
+# Default target
+.PHONY: all
+all: dev build

-unit-tests:
-	mvn clean test -Punit-tests
+dev:
+	@echo "Setting up dev environment..."
+	@command -v pre-commit >/dev/null 2>&1 || { echo "Error: pre-commit is not installed. Please install it first."; exit 1; }
+	@command -v docker >/dev/null 2>&1 || { echo "Error: docker is not installed. Please install it first."; exit 1; }
+	@pre-commit install
+	@pre-commit autoupdate
+	@pre-commit install --install-hooks

-integration-tests:
-	mvn clean verify -Pintegration-tests
+check-formatting:
+	@echo "\033[0;34mChecking code formatting...\033[0m"
+	@mvn spotless:check
+
+apply-formatting:
+	@echo "\033[0;32mApplying code formatting...\033[0m"
+	@mvn spotless:apply
+
+build: apply-formatting
+	@echo "\033[0;34mBuilding project (GPG skipped)...\033[0m"
+	@mvn -B clean install -Dgpg.skip=true -Dmaven.javadoc.skip=true
+
+full-build: apply-formatting
+	@echo "\033[0;34mPerforming full build...\033[0m"
+	@mvn -B clean install
+
+unit-tests: apply-formatting
+	@echo "\033[0;34mRunning unit tests...\033[0m"
+	@mvn clean test -Punit-tests
+
+integration-tests-all: apply-formatting
+	@echo "\033[0;34mRunning integration tests (local - all)...\033[0m"
+	@export USE_EXTERNAL_OLLAMA_HOST=false && mvn clean verify -Pintegration-tests
+
+integration-tests-basic: apply-formatting
+	@echo "\033[0;34mRunning integration tests (local - basic)...\033[0m"
+	@export USE_EXTERNAL_OLLAMA_HOST=false && mvn clean verify -Pintegration-tests -Dit.test=WithAuth
+
+integration-tests-remote: apply-formatting
+	@echo "\033[0;34mRunning integration tests (remote - all)...\033[0m"
+	@export USE_EXTERNAL_OLLAMA_HOST=true && export OLLAMA_HOST=http://192.168.29.229:11434 && mvn clean verify -Pintegration-tests -Dgpg.skip=true

 doxygen:
-	doxygen Doxyfile
+	@echo "\033[0;34mGenerating documentation with Doxygen...\033[0m"
+	@doxygen Doxyfile
+
+javadoc:
+	@echo "\033[0;34mGenerating Javadocs...\033[0m"
+	@mvn clean javadoc:javadoc
+	@if [ -f "target/reports/apidocs/index.html" ]; then \
+		echo "\033[0;32mJavadocs generated in target/reports/apidocs/index.html\033[0m"; \
+	else \
+		echo "\033[0;31mFailed to generate Javadocs in target/reports/apidocs\033[0m"; \
+		exit 1; \
+	fi

 list-releases:
-	curl 'https://central.sonatype.com/api/internal/browse/component/versions?sortField=normalizedVersion&sortDirection=asc&page=0&size=12&filter=namespace%3Aio.github.amithkoujalgi%2Cname%3Aollama4j' \
+	@echo "\033[0;34mListing latest releases...\033[0m"
+	@curl 'https://central.sonatype.com/api/internal/browse/component/versions?sortField=normalizedVersion&sortDirection=desc&page=0&size=20&filter=namespace%3Aio.github.ollama4j%2Cname%3Aollama4j' \
      --compressed \
-      --silent | jq '.components[].version'
+      --silent | jq -r '.components[].version'

-build-docs:
-	npm i --prefix docs && npm run build --prefix docs
+docs-build:
+	@echo "\033[0;34mBuilding documentation site...\033[0m"
+	@cd ./docs && npm ci --no-audit --fund=false && npm run build

-start-docs:
-	npm i --prefix docs && npm run start --prefix docs
+docs-serve:
+	@echo "\033[0;34mServing documentation site...\033[0m"
+	@cd ./docs && npm install && npm run start

 start-cpu:
-	docker run -it -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama
+	@echo "\033[0;34mStarting Ollama (CPU mode)...\033[0m"
+	@docker run -it -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama

 start-gpu:
-	docker run -it --gpus=all -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama
+	@echo "\033[0;34mStarting Ollama (GPU mode)...\033[0m"
+	@docker run -it --gpus=all -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama
--- a/README.md
+++ b/README.md
@@ -1,28 +1,32 @@
+<div align="center">
+  <img src='https://raw.githubusercontent.com/ollama4j/ollama4j/refs/heads/main/ollama4j-new.jpeg' width='200' alt="ollama4j-icon">
+
 ### Ollama4j

-<p align="center">
-  <img src='https://raw.githubusercontent.com/ollama4j/ollama4j/65a9d526150da8fcd98e2af6a164f055572bf722/ollama4j.jpeg' width='100' alt="ollama4j-icon">
-</p>
-
-
-A Java library (wrapper/binding) for [Ollama](https://ollama.ai/) server.
-
-Find more details on the [website](https://ollama4j.github.io/ollama4j/).
+</div>

 <div align="center">
+A Java library (wrapper/binding) for Ollama server.
+
+_Find more details on the **[website](https://ollama4j.github.io/ollama4j/)**._

 ![GitHub stars](https://img.shields.io/github/stars/ollama4j/ollama4j)
 ![GitHub forks](https://img.shields.io/github/forks/ollama4j/ollama4j)
-![GitHub watchers](https://img.shields.io/github/watchers/ollama4j/ollama4j)
 ![Contributors](https://img.shields.io/github/contributors/ollama4j/ollama4j?style=social)
+
+![GitHub last commit](https://img.shields.io/github/last-commit/ollama4j/ollama4j?color=green)
 ![GitHub License](https://img.shields.io/github/license/ollama4j/ollama4j)

+[![Run Unit and Integration Tests](https://github.com/ollama4j/ollama4j/actions/workflows/run-tests.yml/badge.svg)](https://github.com/ollama4j/ollama4j/actions/workflows/run-tests.yml)
+![Sonar Quality Gate](https://img.shields.io/sonar/quality_gate/ollama4j_ollama4j?server=https%3A%2F%2Fsonarcloud.io&label=Sonar%20Quality%20Gate)
+[![codecov](https://codecov.io/gh/ollama4j/ollama4j/graph/badge.svg?token=U0TE7BGP8L)](https://codecov.io/gh/ollama4j/ollama4j)
+</div>
+

 [//]: # (![GitHub repo size]&#40;https://img.shields.io/github/repo-size/ollama4j/ollama4j&#41;)

 [//]: # (![GitHub top language]&#40;https://img.shields.io/github/languages/top/ollama4j/ollama4j&#41;)

-
 [//]: # (![JitPack Downloads This Month Badge]&#40;https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fjitpack.io%2Fapi%2Fdownloads%2Fio.github.ollama4j%2Follama4j&query=%24.month&label=JitPack%20Downloads%20-%20This%20Month&#41;)

 [//]: # (![JitPack Downloads This Week Badge]&#40;https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fjitpack.io%2Fapi%2Fdownloads%2Fio.github.ollama4j%2Follama4j&query=%24.week&label=JitPack%20Downloads%20-%20This%20Week&#41;)
@@ -31,28 +35,60 @@ Find more details on the [website](https://ollama4j.github.io/ollama4j/).

 [//]: # (![GitHub Downloads &#40;all assets, all releases&#41;]&#40;https://img.shields.io/github/downloads/ollama4j/ollama4j/total?label=GitHub%20Package%20Downloads&#41;)

-![GitHub last commit](https://img.shields.io/github/last-commit/ollama4j/ollama4j?color=green)
-[![codecov](https://codecov.io/gh/ollama4j/ollama4j/graph/badge.svg?token=U0TE7BGP8L)](https://codecov.io/gh/ollama4j/ollama4j)
-![Build Status](https://github.com/ollama4j/ollama4j/actions/workflows/maven-publish.yml/badge.svg)
-
-</div>
-
 [//]: # (![Hits]&#40;https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Follama4j%2Follama4j&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=hits&edge_flat=false&#41;)

 [//]: # (![GitHub language count]&#40;https://img.shields.io/github/languages/count/ollama4j/ollama4j&#41;)

 ## Table of Contents

+- [Capabilities](#capabilities)
 - [How does it work?](#how-does-it-work)
 - [Requirements](#requirements)
-- [Installation](#installation)
-- [API Spec](https://ollama4j.github.io/ollama4j/category/apis---model-management)
-- [Javadoc](https://ollama4j.github.io/ollama4j/apidocs/)
+- [Usage](#usage)
+    - [For Maven](#for-maven)
+        - [Using Maven Central](#using-maven-central)
+        - [Using GitHub's Maven Package Repository](#using-githubs-maven-package-repository)
+    - [For Gradle](#for-gradle)
+- [API Spec](#api-spec)
+- [Examples](#examples)
 - [Development](#development)
-- [Contributions](#get-involved)
-- [References](#references)
+    - [Setup dev environment](#setup-dev-environment)
+    - [Build](#build)
+    - [Run unit tests](#run-unit-tests)
+    - [Run integration tests](#run-integration-tests)
+    - [Releases](#releases)
+- [Get Involved](#get-involved)
+- [Who's using Ollama4j?](#whos-using-ollama4j)
+- [Growth](#growth)
+    - [References](#references)
+    - [Credits](#credits)
+    - [Appreciate the work?](#appreciate-the-work)

-#### How does it work?
+## Capabilities
+
+- **Text generation**: Single-turn `generate` with optional streaming and advanced options
+- **Chat**: Multi-turn chat with conversation history and roles
+- **Tool/function calling**: Built-in tool invocation via annotations and tool specs
+- **Reasoning/thinking modes**: Generate and chat with “thinking” outputs where supported
+- **Image inputs (multimodal)**: Generate with images as inputs where models support vision
+- **Embeddings**: Create vector embeddings for text
+- **Async generation**: Fire-and-forget style generation APIs
+- **Custom roles**: Define and use custom chat roles
+- **Model management**: List, pull, create, delete, and get model details
+- **Connectivity utilities**: Server `ping` and process status (`ps`)
+- **Authentication**: Basic auth and bearer token support
+- **Options builder**: Type-safe builder for model parameters and request options
+- **Timeouts**: Configure connect/read/write timeouts
+- **Logging**: Built-in logging hooks for requests and responses
+- **Metrics & Monitoring** 🆕: Built-in Prometheus metrics export for real-time monitoring of requests, model usage, and
+  performance. *(Beta feature – feedback/contributions welcome!)*
+  Check out the [ollama4j-examples](https://github.com/ollama4j/ollama4j-examples) repository for details.
+
+<div align="center">
+  <img src='metrics.png' width='100%' alt="ollama4j-icon">
+</div>
+
+## How does it work?

 ```mermaid
  flowchart LR
@@ -66,71 +102,16 @@ Find more details on the [website](https://ollama4j.github.io/ollama4j/).
    end
 ```

-#### Requirements
+## Requirements

-![Java](https://img.shields.io/badge/Java-11_+-green.svg?style=for-the-badge&labelColor=gray&label=Java&color=orange)
+<p align="center">
+  <img src="https://img.shields.io/badge/Java-11%2B-green.svg?style=for-the-badge&labelColor=gray&label=Java&color=orange" alt="Java"/>
+  <a href="https://ollama.com/" target="_blank">
+    <img src="https://img.shields.io/badge/Ollama-0.11.10+-blue.svg?style=for-the-badge&labelColor=gray&label=Ollama&color=blue" alt="Ollama"/>
+  </a>
+</p>

-
-<a href="https://ollama.com/" target="_blank">
-  <img src="https://img.shields.io/badge/v0.3.0-green.svg?style=for-the-badge&labelColor=gray&label=Ollama&color=blue" alt=""/>
-</a>
-
-<table>
-<tr>
-<td> 
-
-<a href="https://ollama.ai/" target="_blank">Local Installation</a>
-
-</td> 
-
-<td> 
-
-<a href="https://hub.docker.com/r/ollama/ollama" target="_blank">Docker Installation</a>
-
-</td>
-</tr>
-<tr>
-<td>
-
-<a href="https://ollama.com/download/Ollama-darwin.zip" target="_blank">Download for macOS</a>
-
-<a href="https://ollama.com/download/OllamaSetup.exe" target="_blank">Download for Windows</a>
-
-Install on Linux
-
-```shell 
-curl -fsSL https://ollama.com/install.sh | sh
-```
-
-</td>
-<td>
-
-
-
-CPU only
-
-```shell
-docker run -d -p 11434:11434 \
-  -v ollama:/root/.ollama \
-  --name ollama \
-  ollama/ollama
-```
-
-NVIDIA GPU
-
-```shell
-docker run -d -p 11434:11434 \
-  --gpus=all \
-  -v ollama:/root/.ollama \
-  --name ollama \
-  ollama/ollama
-```
-
-</td>
-</tr>
-</table>
-
-## Installation
+## Usage

 > [!NOTE]
 > We are now publishing the artifacts to both Maven Central and GitHub package repositories.
@@ -155,7 +136,7 @@ In your Maven project, add this dependency:
 <dependency>
    <groupId>io.github.ollama4j</groupId>
    <artifactId>ollama4j</artifactId>
-    <version>1.0.79</version>
+    <version>1.1.0</version>
 </dependency>
 ```

@@ -211,7 +192,7 @@ In your Maven project, add this dependency:
 <dependency>
    <groupId>io.github.ollama4j</groupId>
    <artifactId>ollama4j</artifactId>
-    <version>1.0.79</version>
+    <version>1.1.0</version>
 </dependency>
 ```

@@ -221,7 +202,7 @@ In your Maven project, add this dependency:

 ```groovy
 dependencies {
-    implementation 'io.github.ollama4j:ollama4j:1.0.79'
+    implementation 'io.github.ollama4j:ollama4j:1.1.0'
 }
 ```

@@ -239,56 +220,69 @@ dependencies {

 [lib-shield]: https://img.shields.io/badge/ollama4j-get_latest_version-blue.svg?style=just-the-message&labelColor=gray

-#### API Spec
+### API Spec

 > [!TIP]
 > Find the full API specifications on the [website](https://ollama4j.github.io/ollama4j/).

-#### Development
+## Examples

-Build:
+For practical examples and usage patterns of the Ollama4j library, check out
+the [ollama4j-examples](https://github.com/ollama4j/ollama4j-examples) repository.
+
+## Development
+
+Make sure you have `pre-commit` installed.
+
+With `brew`:
+
+```shell
+brew install pre-commit
+```
+
+With `pip`:
+
+```shell
+pip install pre-commit
+```
+
+#### Setup dev environment
+
+> **Note**
+> If you're on Windows, install [Chocolatey Package Manager for Windows](https://chocolatey.org/install) and then
+> install `make` by running `choco install make`. Tip: run the command with administrator privileges if the
+> installation fails.
+
+```shell
+make dev
+```
+
+#### Build

 ```shell
 make build
 ```

-Run unit tests:
+#### Run unit tests

 ```shell
 make unit-tests
 ```

-Run integration tests:
+#### Run integration tests
+
+Make sure Docker is running, as this uses [Testcontainers](https://testcontainers.com/) to run the integration
+tests against an Ollama Docker container.

 ```shell
 make integration-tests
 ```

-#### Releases
+### Releases

 Newer artifacts are published via GitHub Actions CI workflow when a new release is created from `main` branch.

-#### Who's using Ollama4j?
-
- `Datafaker`: a library to generate fake data
-    - https://github.com/datafaker-net/datafaker-experimental/tree/main/ollama-api
- `Vaadin Web UI`: UI-Tester for Interactions with Ollama via ollama4j
-    - https://github.com/TEAMPB/ollama4j-vaadin-ui
- `ollama-translator`: Minecraft 1.20.6 spigot plugin allows to easily break language barriers by using ollama on the
-  server to translate all messages into a specfic target language.
-    - https://github.com/liebki/ollama-translator
-    - https://www.reddit.com/r/fabricmc/comments/1e65x5s/comment/ldr2vcf/
- `Ollama4j Web UI`: A web UI for Ollama written in Java using Spring Boot and Vaadin framework and
-  Ollama4j.
-    - https://github.com/ollama4j/ollama4j-web-ui
- `JnsCLI`: A command-line tool for Jenkins that manages jobs, builds, and configurations directly from the terminal while offering AI-powered error analysis for quick troubleshooting.
-    -  https://github.com/mirum8/jnscli
-
-#### Traction
-
-[![Star History Chart](https://api.star-history.com/svg?repos=ollama4j/ollama4j&type=Date)](https://star-history.com/#ollama4j/ollama4j&Date)
-
-### Get Involved
+## Get Involved

 <div align="center">

@@ -300,6 +294,40 @@ Newer artifacts are published via GitHub Actions CI workflow when a new release

 </div>

+Contributions are most welcome! Whether it's reporting a bug, proposing an enhancement, or helping
+with code - any sort of contribution is much appreciated.
+
+<div style="font-size: 15px; font-weight: bold; padding-top: 10px; padding-bottom: 10px; border: 1px solid" align="center">
+If you like or use this project, please give us a ⭐. It's a free way to show your support.
+</div>
+
+## Who's using Ollama4j?
+
+| #  | Project Name      | Description                                                                                                                                                        | Link                                                                                                                                                                                          |
+|----|-------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| 1  | Datafaker         | A library to generate fake data                                                                                                                                    | [GitHub](https://github.com/datafaker-net/datafaker-experimental/tree/main/ollama-api)                                                                                                        |
+| 2  | Vaadin Web UI     | UI-Tester for interactions with Ollama via ollama4j                                                                                                                | [GitHub](https://github.com/TEAMPB/ollama4j-vaadin-ui)                                                                                                                                        |
+| 3  | ollama-translator | A Minecraft 1.20.6 Spigot plugin that translates all messages into a specific target language via Ollama                                                           | [GitHub](https://github.com/liebki/ollama-translator)                                                                                                                                         |
+| 4  | AI Player         | A Minecraft mod that adds an intelligent "second player" to the game                                                                                               | [Website](https://modrinth.com/mod/ai-player), [GitHub](https://github.com/shasankp000/AI-Player), <br/> [Reddit Thread](https://www.reddit.com/r/fabricmc/comments/1e65x5s/comment/ldr2vcf/) |
+| 5  | Ollama4j Web UI   | A web UI for Ollama written in Java using Spring Boot, Vaadin, and Ollama4j                                                                                        | [GitHub](https://github.com/ollama4j/ollama4j-web-ui)                                                                                                                                         |
+| 6  | JnsCLI            | A command-line tool for Jenkins that manages jobs, builds, and configurations, with AI-powered error analysis                                                      | [GitHub](https://github.com/mirum8/jnscli)                                                                                                                                                    |
+| 7  | Katie Backend     | An open-source AI-based question-answering platform for accessing private domain knowledge                                                                         | [GitHub](https://github.com/wyona/katie-backend)                                                                                                                                              |
+| 8  | TeleLlama3 Bot    | A question-answering Telegram bot                                                                                                                                  | [Repo](https://git.hiast.edu.sy/mohamadbashar.disoki/telellama3-bot)                                                                                                                          |
+| 9  | moqui-wechat      | A moqui-wechat component                                                                                                                                           | [GitHub](https://github.com/heguangyong/moqui-wechat)                                                                                                                                         |
+| 10 | B4X               | A set of simple and powerful RAD tools for Desktop and Server development                                                                                          | [Website](https://www.b4x.com/android/forum/threads/ollama4j-library-pnd_ollama4j-your-local-offline-llm-like-chatgpt.165003/)                                                                |
+| 11 | Research Article  | Article: `Large language model based mutations in genetic improvement` - published on National Library of Medicine (National Center for Biotechnology Information) | [Website](https://pmc.ncbi.nlm.nih.gov/articles/PMC11750896/)                                                                                                                                 |
+| 12 | renaime           | A LLaVa powered tool that automatically renames image files having messy file names.                                                                               | [Website](https://devpost.com/software/renaime)                                                                                                                                               |
+
+## Growth
+
+<p align="center">
+  <a href="https://star-history.com/#ollama4j/ollama4j&Date" target="_blank" rel="noopener noreferrer">
+    <img
+      src="https://api.star-history.com/svg?repos=ollama4j/ollama4j&type=Date"
+      alt="Star History Chart"
+    />
+  </a>
+</p>

 [//]: # (![GitHub Issues or Pull Requests]&#40;https://img.shields.io/github/issues-raw/ollama4j/ollama4j&#41;)

@@ -311,11 +339,6 @@ Newer artifacts are published via GitHub Actions CI workflow when a new release

 [//]: # (![GitHub Discussions]&#40;https://img.shields.io/github/discussions/ollama4j/ollama4j&#41;)

-
-Contributions are most welcome! Whether it's reporting a bug, proposing an enhancement, or helping
-with code - any sort
-of contribution is much appreciated.
-
 ### References

 - [Ollama REST APIs](https://github.com/jmorganca/ollama/blob/main/docs/api.md)
@@ -329,11 +352,11 @@ project.

 <p align="center">
  <a href="https://github.com/ollama4j/ollama4j/graphs/contributors">
-    <img src="https://contrib.rocks/image?repo=ollama4j/ollama4j"  alt=""/>
+    <img src="https://contrib.rocks/image?repo=ollama4j/ollama4j" alt=""/>
  </a>
 </p>

-### Appreciate my work?
+### Appreciate the work?

 <p align="center">
  <a href="https://www.buymeacoffee.com/amithkoujalgi" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -0,0 +1,39 @@
+## Security Policy
+
+### Supported Versions
+
+We aim to support the latest released version of `ollama4j` and the most recent minor version prior to it. Older versions may receive fixes on a best-effort basis.
+
+### Reporting a Vulnerability
+
+Please do not open public GitHub issues for security vulnerabilities.
+
+Instead, email the maintainer at:
+
+```
+koujalgi.amith@gmail.com
+```
+
+Include as much detail as possible:
+
+- A clear description of the issue and impact
+- Steps to reproduce or proof-of-concept
+- Affected version(s) and environment
+- Any suggested mitigations or patches
+
+You should receive an acknowledgement within 72 hours. We will work with you to validate the issue, determine severity, and prepare a fix.
+
+### Disclosure
+
+We follow a responsible disclosure process:
+
+1. Receive and validate report privately.
+2. Develop and test a fix.
+3. Coordinate a release that includes the fix.
+4. Publicly credit the reporter (if desired) in release notes.
+
+### GPG Signatures
+
+Releases may be signed as part of our CI pipeline. If verification fails or you have concerns about release integrity, please contact us via the email above.
+
+
--- a/docs/METRICS.md
+++ b/docs/METRICS.md
@@ -0,0 +1,186 @@
+# Prometheus Metrics Integration
+
+Ollama4j now includes comprehensive Prometheus metrics collection to help you monitor and observe your Ollama API usage. This feature allows you to track request counts, response times, model usage, and other operational metrics.
+
+## Features
+
+The metrics integration provides the following metrics:
+
+- **Request Metrics**: Total requests, duration histograms, and response time summaries by endpoint
+- **Model Usage**: Model-specific usage statistics and response times
+- **Token Generation**: Token count tracking per model
+- **Error Tracking**: Error counts by type and endpoint
+- **Active Connections**: Current number of active API connections
+
+## Quick Start
+
+### 1. Enable Metrics Collection
+
+```java
+import io.github.ollama4j.Ollama;
+
+// Create the client, then enable metrics collection
+Ollama ollama = new Ollama();
+
+// Metrics are disabled by default
+ollama.setMetricsEnabled(true);
+```
+
+### 2. Start Metrics Server
+
+```java
+import io.prometheus.client.exporter.HTTPServer;
+
+// Start Prometheus metrics HTTP server on port 8080
+HTTPServer metricsServer = new HTTPServer(8080);
+System.out.println("Metrics available at: http://localhost:8080/metrics");
+```
+
+### 3. Use the API (Metrics are automatically collected)
+
+```java
+// All API calls are automatically instrumented
+boolean isReachable = ollama.ping();
+
+Map<String, Object> format = new HashMap<>();
+format.put("type", "json");
+OllamaResult result = ollama.generateWithFormat(
+    "llama2",
+    "Generate a JSON object",
+    format
+);
+```
+
+## Available Metrics
+
+### Request Metrics
+
+- `ollama_api_requests_total` - Total number of API requests by endpoint, method, and status
+- `ollama_api_request_duration_seconds` - Request duration histogram by endpoint and method
+- `ollama_api_response_time_seconds` - Response time summary with percentiles
+
+### Model Metrics
+
+- `ollama_model_usage_total` - Model usage count by model name and operation
+- `ollama_model_response_time_seconds` - Model response time histogram
+- `ollama_tokens_generated_total` - Total tokens generated by model
+
+### System Metrics
+
+- `ollama_api_active_connections` - Current number of active connections
+- `ollama_api_errors_total` - Error count by endpoint and error type
+
+## Example Metrics Output
+
+```
+# HELP ollama_api_requests_total Total number of Ollama API requests
+# TYPE ollama_api_requests_total counter
+ollama_api_requests_total{endpoint="/api/generate",method="POST",status="success"} 5.0
+ollama_api_requests_total{endpoint="/api/embed",method="POST",status="success"} 3.0
+
+# HELP ollama_api_request_duration_seconds Duration of Ollama API requests in seconds
+# TYPE ollama_api_request_duration_seconds histogram
+ollama_api_request_duration_seconds_bucket{endpoint="/api/generate",method="POST",le="0.1"} 0.0
+ollama_api_request_duration_seconds_bucket{endpoint="/api/generate",method="POST",le="0.5"} 2.0
+ollama_api_request_duration_seconds_bucket{endpoint="/api/generate",method="POST",le="1.0"} 4.0
+ollama_api_request_duration_seconds_bucket{endpoint="/api/generate",method="POST",le="+Inf"} 5.0
+ollama_api_request_duration_seconds_sum{endpoint="/api/generate",method="POST"} 2.5
+ollama_api_request_duration_seconds_count{endpoint="/api/generate",method="POST"} 5.0
+
+# HELP ollama_model_usage_total Total number of model usage requests
+# TYPE ollama_model_usage_total counter
+ollama_model_usage_total{model_name="llama2",operation="generate_with_format"} 5.0
+ollama_model_usage_total{model_name="llama2",operation="embed"} 3.0
+
+# HELP ollama_tokens_generated_total Total number of tokens generated
+# TYPE ollama_tokens_generated_total counter
+ollama_tokens_generated_total{model_name="llama2"} 150.0
+```
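Prometheus histogram buckets are cumulative: each `le` bucket counts every observation at or below that bound. The following is a minimal sketch in plain Java that converts the cumulative counts from the sample output above into per-bucket counts. The class and method names are illustrative assumptions, not part of Ollama4j or the Prometheus client.

```java
import java.util.Arrays;

// Illustrative helper (hypothetical name); not part of Ollama4j or the Prometheus client.
class HistogramBuckets {

    // Converts cumulative bucket counts, as exposed by Prometheus, into per-bucket counts.
    static double[] perBucket(double[] cumulative) {
        double[] counts = new double[cumulative.length];
        double previous = 0.0;
        for (int i = 0; i < cumulative.length; i++) {
            counts[i] = cumulative[i] - previous;
            previous = cumulative[i];
        }
        return counts;
    }

    public static void main(String[] args) {
        // Cumulative counts for le="0.1", "0.5", "1.0", "+Inf" taken from the sample output
        double[] cumulative = {0.0, 2.0, 4.0, 5.0};
        System.out.println(Arrays.toString(perBucket(cumulative)));
    }
}
```

Read this way, the sample shows no requests at or under 0.1 s, two between 0.1 s and 0.5 s, two between 0.5 s and 1 s, and one slower than 1 s.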
+
+## Configuration
+
+### Enable/Disable Metrics
+
+```java
+Ollama ollama = new Ollama();
+
+// Enable metrics collection
+ollama.setMetricsEnabled(true);
+
+// Disable metrics collection (default)
+ollama.setMetricsEnabled(false);
+```
+
+### Custom Metrics Server
+
+```java
+import io.prometheus.client.exporter.HTTPServer;
+
+// Start on a custom port
+HTTPServer metricsServer = new HTTPServer(9090);
+
+// Alternatively, bind to a specific host and port (use one of the two, not both)
+HTTPServer boundMetricsServer = new HTTPServer("0.0.0.0", 9090);
+```
+
+## Integration with Prometheus
+
+### Prometheus Configuration
+
+Add this to your `prometheus.yml`:
+
+```yaml
+scrape_configs:
+  - job_name: 'ollama4j'
+    static_configs:
+      - targets: ['localhost:8080']
+    scrape_interval: 15s
+```
+
+### Grafana Dashboards
+
+You can create Grafana dashboards using the metrics. Some useful queries:
+
+- **Request Rate**: `rate(ollama_api_requests_total[5m])`
+- **Average Response Time**: `rate(ollama_api_request_duration_seconds_sum[5m]) / rate(ollama_api_request_duration_seconds_count[5m])`
+- **Error Rate**: `rate(ollama_api_requests_total{status="error"}[5m]) / rate(ollama_api_requests_total[5m])`
+- **Model Usage**: `rate(ollama_model_usage_total[5m])`
+- **Token Generation Rate**: `rate(ollama_tokens_generated_total[5m])`
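The average-response-time query divides a histogram's `_sum` by its `_count`. As a rough illustration of that arithmetic outside Prometheus, the sketch below parses a scrape body and computes the lifetime average for one label set. It is a simplification: PromQL's `rate()` works over a time window, whereas this uses a single scrape, and the parser assumes no spaces inside label values and no timestamps. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative helper (hypothetical name); not part of Ollama4j or the Prometheus client.
class ScrapeAverages {

    // Parses Prometheus text-format samples into a map of "name{labels}" -> value.
    // Simplified: assumes no spaces inside label values and no trailing timestamps.
    static Map<String, Double> parse(String body) {
        Map<String, Double> samples = new LinkedHashMap<>();
        for (String line : body.split("\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue; // skip blank lines and HELP/TYPE comments
            }
            int split = trimmed.lastIndexOf(' ');
            if (split < 0) {
                continue; // not a "key value" sample line
            }
            samples.put(trimmed.substring(0, split),
                    Double.parseDouble(trimmed.substring(split + 1)));
        }
        return samples;
    }

    // Lifetime average duration in seconds: _sum / _count for one label set.
    // Throws NullPointerException if either sample is missing from the map.
    static double averageSeconds(Map<String, Double> samples, String labels) {
        double sum = samples.get("ollama_api_request_duration_seconds_sum" + labels);
        double count = samples.get("ollama_api_request_duration_seconds_count" + labels);
        return count == 0.0 ? 0.0 : sum / count;
    }

    public static void main(String[] args) {
        String labels = "{endpoint=\"/api/generate\",method=\"POST\"}";
        String body = String.join("\n",
                "# TYPE ollama_api_request_duration_seconds histogram",
                "ollama_api_request_duration_seconds_sum" + labels + " 2.5",
                "ollama_api_request_duration_seconds_count" + labels + " 5.0");
        System.out.println(averageSeconds(parse(body), labels));
    }
}
```

With the values from the example output (`_sum` of 2.5 over a `_count` of 5), the average works out to 0.5 seconds per request.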
+
+## Performance Considerations
+
+- Metrics collection adds minimal overhead (~1-2% in most cases)
+- Metrics are collected asynchronously and don't block API calls
+- You can disable metrics in production if needed: `ollama.setMetricsEnabled(false)`
+- The metrics server uses minimal resources
+
+## Troubleshooting
+
+### Metrics Not Appearing
+
+1. Ensure metrics are enabled: `ollama.setMetricsEnabled(true)`
+2. Check that the metrics server is running: `http://localhost:8080/metrics`
+3. Verify API calls are being made (metrics only appear after API usage)
+
+### High Memory Usage
+
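+- Metrics are held in memory for the lifetime of the process, and usage grows with the number of distinct label combinations (for example, many different model names)
+- Scraping does not reset metrics; if memory use becomes a problem, reduce label cardinality or consider restarting your application periodically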
+
+### Custom Metrics
+
+You can extend the metrics by accessing the Prometheus registry directly:
+
+```java
+import io.prometheus.client.CollectorRegistry;
+import io.prometheus.client.Counter;
+
+// Create custom metrics
+Counter customCounter = Counter.build()
+    .name("my_custom_metric_total")
+    .help("My custom metric")
+    .register();
+
+// Use the metric
+customCounter.inc();
+```
--- a/docs/blog/2023-12-01-welcome/index.md
+++ b/docs/blog/2023-12-01-welcome/index.md
@@ -1,9 +0,0 @@
---
-slug: welcome
-title: Welcome
-authors: [ amith ]
-tags: [ Java, AI, LLM, GenAI, GenerativeAI, Ollama, Ollama4J, OpenSource, Developers
-]
---
-
-Welcome Java Developers!
--- a/docs/blog/2023-12-22-release-post.md
+++ b/docs/blog/2023-12-22-release-post.md
@@ -1,6 +1,6 @@
 ---
 slug: release-post
-title: Release
+title: First Release 🚀
 authors: [ amith ]
 tags: [ Java, AI, LLM, GenAI, GenerativeAI, Ollama, Ollama4j, OpenSource, Developers
 ]
@@ -23,6 +23,8 @@ applications! 🌐🚀
 - Ability to ask questions along with image files or image URLs! 🤩
 - Open-source and primed for collaborative contributions from the community!

+<!-- truncate -->
+
 🦙 What is Ollama?

 Ollama is an advanced AI tool that allows users to easily set up and run large language models locally (in CPU and GPU
--- a/docs/blog/2025-03-08-blog/index.md
+++ b/docs/blog/2025-03-08-blog/index.md
@@ -0,0 +1,710 @@
+---
+slug: talk-to-your-data-on-couchbase-via-ollama4j
+title: "Talk to Your Data Using Natural Language: A Guide to Interacting with Couchbase via Ollama4j"
+authors: [ amith ]
+tags: [ Java, AI, LLM, GenAI, GenerativeAI, Generative AI Tools, Ollama, Ollama4J, OpenSource, Developers,
+]
+---
+
+Some time back, I created a small wrapper called Ollama4j to interact with the Ollama server over the REST API in Java as
+a side project and made the [repository](https://github.com/ollama4j/ollama4j) public on GitHub. Over time, the project
+gained traction, with many fellow Java
+developers contributing, and it now boasts over _300 stars_! 😍
+
+We’ve consistently introduced new features, and when we added the tool-calling capability, the library became incredibly
+powerful, opening up so many possibilities. With this addition, we could automate numerous tasks using natural language!
+I wanted to share how to make the most of this functionality.
+
+In this article, we’ll explore how to use Ollama4j, a Java SDK for interacting with Ollama-hosted models, to leverage
+tool-calling models like Mistral for querying a Couchbase database. The goal is to create a system where you can query
+your database using natural, conversational language — just like interacting with a virtual assistant. We’ll walk you
+through the code, explain the key components, and show you how to set up your environment to ensure everything runs
+smoothly.
+
+<!-- truncate -->
+
+### Overview of the Technologies Involved
+
+Before diving into the implementation, let’s understand the core technologies we’re using:
+
+- **Ollama4j**: A Java SDK that interacts with hosted AI models through a convenient API. Ollama allows you to interact
+  with
+  pre-trained models (like Mistral) and access additional tools that can be applied to real-world tasks.
+- **Mistral**: A powerful, language-based model that can be used for a variety of tasks, including answering questions,
+  text
+  generation, and data retrieval from external sources. While I’ve used Mistral in this instance, you can easily replace
+  it with [any other model](https://ollama.com/search?c=tools) that supports tool-calling capabilities.
+- **Couchbase**: A NoSQL database that provides a flexible and scalable data model. In this example, we’ll query a
+  Couchbase
+  database to retrieve airline information.
+
+The magic happens when we combine these technologies to allow the model to query the database in a more intuitive and
+human-like way, acting as an interface between the user’s natural language and Couchbase’s structured data.
+
+
+> Oh, by the way, you can either set up
+> a [Couchbase server](https://www.couchbase.com/downloads/?family=couchbase-server) on your own or, if you prefer a
+> more
+> effortless approach like
+> I
+> do, give [Couchbase Capella](https://www.couchbase.com/products/capella/) a spin. It’s a fully managed
+> Database-as-a-Service (DBaaS) with a free tier 🎉 that’s so
+> easy
+> to set up, you’ll be querying your data in no time. It’s perfect for developers who want to dive in without any
+> hassle —
+> it’s like having your own cloud database, minus the headache!
+
+
+In the following section, we will walk you through the simple steps to create your free Couchbase Capella database
+cluster. If you’d prefer to set up your own Couchbase server elsewhere, feel free to skip this section and go directly
+to the [Code Environment Setup](#setting-up-the-environment-for-code) section.
+
+### Sign up for a free database cluster on Couchbase Capella
+Head over to https://cloud.couchbase.com/sign-in and sign up for an account.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*vsJC0ugfoh9vpYNapt4-5A.png'} />
+
+Once you’re in, you will be able to create a new database cluster. Click on the _**Operational**_ tab and click on the
+**_Create Cluster_** button.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*ZNicgmYNkclgaBIxwRN7Ug.png'} />
+
+Select the default project named **_My First Project_** and click on the **_Continue_** button.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*vfc2cF7IgkjLtNXvls8giQ.png'} />
+
+You’ll now see the available cluster options. Go ahead and select the **_Free_** option! 😍
+Next, choose your preferred cloud provider (you can select any provider or stick with the default AWS provider).
+Pick a region (or leave it set to the default).
+Finally, click on the Create Cluster button to proceed.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*rdWpeSrUaBKC6Y5q8Kd6EA.png'} />
+
+Give it a couple of minutes, and let the magic happen as your cluster gets deployed.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*no3uHx8cIzVBn7qccYEZ3A.png'} />
+
+Once your cluster is deployed, you’ll see the status of your cluster as **_Healthy_**.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Jyu9uiSDSE0o-EQRb53CJA.png'} />
+
+Click on the listed cluster to open its details. Here, you can view the version of the deployed Couchbase server, the
+enabled services, as well as the cloud provider and region.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Sv-7wQuAoD0l0bjbI5I7Aw.png'} />
+
+Click on **_Explore Data_** button. Notice that a default bucket called **_travel-sample_** with some sample data has
+been created
+for you.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*z85GsgMBvdR2mrvKUrIjJg.png'} />
+
+Browse through the collection to explore the pre-created buckets, scopes and collections available to you.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Qr84bs1dvn6m9ZjkNxXvUg.png'} />
+
+Open up a sample document from the **_travel-sample_** (bucket) > **_inventory_** (scope) > **_airline_** (collection)
+to see the contents
+of the document.
+
+The document shown in the image below is about an airline named **_Astraeus_**, whose call sign (a unique name or code
+used to
+identify an airline or aircraft in communication) is **_FLYSTAR_**.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*AmvixYfdNNKC6nXNNXbe4Q.png'} />
+
+Navigate to the **_Connect_** tab, and you will see a **_Public Connection String_** that allows you to access the
+Capella cluster
+endpoint from your client application, which looks like the following URL:
+
+```
+couchbases://cb.uniqueclusteridentifer.cloud.couchbase.com
+```
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*jwnVdj5ZOQMHoggj9JZeJQ.png'} />
+
+To access this cluster endpoint, you need to allow the IP addresses that are permitted to connect. Click on the
+**_Settings_**
+tab, which will take you to the **_Cluster Settings_** view. Then, click on **_Allowed IP Addresses_** in the left pane
+under
+**_Networking_**, where you can add allowed IP addresses. Then, click on the **_Add Allowed IP_** button.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*tS83AJaNzlBa4Q3aadxohw.png'} />
+
+You can either click on the **_Add Current IP Address_** button to limit access to your cluster to your IP address
+alone, or
+if you’d like to allow access from anywhere, click on the **_Allow Access from Anywhere_** button.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*XBgqQoXQQJyYg51Ztugw6w.png'} />
+
+Confirm that you want to allow the IP addresses.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*WjfYQQaiT2WqwNnWvUCyww.png'} />
+
+The IP addresses have now been added to the allow list, and the networking is set up.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*5BHIp2rqUf7E_GNX8TENoA.png'} />
+
+Now that you’ve allowed IP addresses, it’s time to create credentials for accessing the cluster using a username and
+password. Click on the **_Cluster Access_** tab in the left pane, then click on the **_Create Cluster Access_** button.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*Q5l_EE3gGtxiANdkKilVTQ.png'} />
+
+Enter a username of your choice in the **_Cluster Access Name_** text field, and then enter a password of your choice in
+the
+**_Password_** text field.
+
+Next, select the bucket, scope, and the read/write permissions you want these credentials to have access to. In this
+example, I’ve granted access to all buckets and scopes with both read and write permissions.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*j2DRB1oDWE78SKpcsIb2SA.png'} />
+
+Alright, your cluster access is now set up.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*8TY-5DPDfQlwz0-2IYR8Sg.png'} />
+
+One last step: you just need to select the **_Cluster Access Credentials_** that you want to allow to connect to your
+Capella
+cluster. Head over to the **_Connect_** tab, then click on the **_SDKs_** tab in the left pane. Under
+**_Choose the Cluster Access Credentials you want to use to connect to your Capella cluster_**, select the cluster credentials you
+just created.
+
+<img src={'https://miro.medium.com/v2/resize:fit:1400/format:webp/1*sIlH51v2HllTzBDV8K-9Aw.png'} />
+
+Awesome! Your cluster access is all set up, and you’re ready to connect to your Capella cluster using a Couchbase
+client. That’s it — you’re all set and good to go!
+
+### Setting Up the Environment For Code
+
+Before you begin, ensure you have the following components set up.
+
+**Java**: Make sure you have Java 11+ installed on your system. Set it up
+from [here](https://www.oracle.com/in/java/technologies/downloads/). Verify it by running the following
+command in your terminal.
+
+```shell
+java --version
+```
+
+**Maven**: Make sure you have the Maven build system set up. Set it up from [here](https://maven.apache.org/download.cgi).
+Verify it by running the following command
+in your terminal.
+
+```shell
+mvn --version
+```
+
+**Ollama Server**: Make sure you have installed the latest version of [Ollama server](https://ollama.com/) and it is up
+and running. Verify it by
+running the following command in your terminal.
+
+```shell
+ollama --version
+```
+
+**Model**: You’ll need a [tool-calling model](https://ollama.com/search?c=tools) (such as Mistral) downloaded and ready to
+serve from your Ollama server.
+
+To download/pull the model into your Ollama server, run the following command in your terminal.
+
+```shell
+ollama pull mistral
+
+```
+
+You can list the models available on your model server by running the following command in your terminal.
+
+```shell
+ollama list
+```
+
+Once you have these, you can start setting up the application.
+
+Set up `pom.xml` for your Maven project.
+
+```xml
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <groupId>io.github.ollama4j.couchbase</groupId>
+    <artifactId>ollama4j-couchbase</artifactId>
+    <version>0.0.1</version>
+    <name>Ollama4j Couchbase</name>
+    <description>Talk to your data in Couchbase over Ollama4j</description>
+    <packaging>jar</packaging>
+
+    <properties>
+        <maven.compiler.release>11</maven.compiler.release>
+        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+        <lombok.version>1.18.30</lombok.version>
+    </properties>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-source-plugin</artifactId>
+                <version>3.3.1</version>
+                <executions>
+                    <execution>
+                        <id>attach-sources</id>
+                        <goals>
+                            <goal>jar-no-fork</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-javadoc-plugin</artifactId>
+                <version>3.11.2</version>
+                <configuration>
+                    <!-- to disable the "missing" warnings. Remove the doclint to enable warnings-->
+                    <doclint>all,-missing</doclint>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>attach-javadocs</id>
+                        <goals>
+                            <goal>jar</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+
+        <pluginManagement>
+            <plugins>
+                <plugin>
+                    <artifactId>maven-compiler-plugin</artifactId>
+                    <version>3.14.0</version>
+                </plugin>
+            </plugins>
+        </pluginManagement>
+    </build>
+
+    <dependencies>
+        <dependency>
+            <groupId>io.github.ollama4j</groupId>
+            <artifactId>ollama4j</artifactId>
+            <version>ollama4j-revision</version>
+        </dependency>
+
+        <!-- SLF4J API -->
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-api</artifactId>
+            <version>2.0.0</version>
+        </dependency>
+
+        <!-- Logback Classic (SLF4J binding) -->
+        <dependency>
+            <groupId>ch.qos.logback</groupId>
+            <artifactId>logback-classic</artifactId>
+            <version>1.4.12</version>
+        </dependency>
+
+        <dependency>
+            <groupId>com.couchbase.client</groupId>
+            <artifactId>java-client</artifactId>
+            <version>3.7.8</version>
+        </dependency>
+
+        <dependency>
+            <groupId>org.projectlombok</groupId>
+            <artifactId>lombok</artifactId>
+            <version>${lombok.version}</version>
+            <scope>provided</scope>
+        </dependency>
+    </dependencies>
+</project>
+```
+
+### Code Walkthrough
+
+Here’s the main part of the implementation in the Java code.
+
+```java
+package io.github.ollama4j.examples;
+
+
+import com.couchbase.client.java.Bucket;
+import com.couchbase.client.java.Cluster;
+import com.couchbase.client.java.ClusterOptions;
+import com.couchbase.client.java.Scope;
+import com.couchbase.client.java.json.JsonObject;
+import com.couchbase.client.java.query.QueryResult;
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.exceptions.OllamaException;
+import io.github.ollama4j.exceptions.ToolInvocationException;
+import io.github.ollama4j.tools.OllamaToolsResult;
+import io.github.ollama4j.tools.ToolFunction;
+import io.github.ollama4j.tools.Tools;
+import io.github.ollama4j.utils.OptionsBuilder;
+import io.github.ollama4j.utils.Utilities;
+import lombok.AllArgsConstructor;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.io.IOException;
+import java.time.Duration;
+import java.util.Arrays;
+import java.util.Map;
+
+
+public class CouchbaseToolCallingExample {
+
+  public static void main(String[] args) throws IOException, ToolInvocationException, OllamaException, InterruptedException {
+    String connectionString = Utilities.getFromEnvVar("CB_CLUSTER_URL");
+    String username = Utilities.getFromEnvVar("CB_CLUSTER_USERNAME");
+    String password = Utilities.getFromEnvVar("CB_CLUSTER_PASSWORD");
+    String bucketName = "travel-sample";
+
+    Cluster cluster = Cluster.connect(
+            connectionString,
+            ClusterOptions.clusterOptions(username, password).environment(env -> {
+              env.applyProfile("wan-development");
+            })
+    );
+
+    String host = Utilities.getFromConfig("host");
+    String modelName = Utilities.getFromConfig("tools_model_mistral");
+
+    Ollama ollama = new Ollama(host);
+    ollama.setRequestTimeoutSeconds(60);
+
+    Tools.ToolSpecification callSignFinderToolSpec = getCallSignFinderToolSpec(cluster, bucketName);
+    Tools.ToolSpecification callSignUpdaterToolSpec = getCallSignUpdaterToolSpec(cluster, bucketName);
+
+    ollama.registerTool(callSignFinderToolSpec);
+    ollama.registerTool(callSignUpdaterToolSpec);
+
+    String prompt1 = "What is the call-sign of Astraeus?";
+    for (OllamaToolsResult.ToolResult r : ollama.generateWithTools(modelName, new Tools.PromptBuilder()
+            .withToolSpecification(callSignFinderToolSpec)
+            .withPrompt(prompt1)
+            .build(), new OptionsBuilder().build()).getToolResults()) {
+      AirlineDetail airlineDetail = (AirlineDetail) r.getResult();
+      System.out.println(String.format("[Result of tool '%s']: Call-sign of %s is '%s'! ✈️", r.getFunctionName(), airlineDetail.getName(), airlineDetail.getCallsign()));
+    }
+
+    String prompt2 = "I want to code name Astraeus as STARBOUND";
+    for (OllamaToolsResult.ToolResult r : ollama.generateWithTools(modelName, new Tools.PromptBuilder()
+            .withToolSpecification(callSignUpdaterToolSpec)
+            .withPrompt(prompt2)
+            .build(), new OptionsBuilder().build()).getToolResults()) {
+      Boolean updated = (Boolean) r.getResult();
+      System.out.println(String.format("[Result of tool '%s']: Call-sign is %s! ✈️", r.getFunctionName(), updated ? "updated" : "not updated"));
+    }
+
+    String prompt3 = "What is the call-sign of Astraeus?";
+    for (OllamaToolsResult.ToolResult r : ollama.generateWithTools(modelName, new Tools.PromptBuilder()
+            .withToolSpecification(callSignFinderToolSpec)
+            .withPrompt(prompt3)
+            .build(), new OptionsBuilder().build()).getToolResults()) {
+      AirlineDetail airlineDetail = (AirlineDetail) r.getResult();
+      System.out.println(String.format("[Result of tool '%s']: Call-sign of %s is '%s'! ✈️", r.getFunctionName(), airlineDetail.getName(), airlineDetail.getCallsign()));
+    }
+  }
+
+  public static Tools.ToolSpecification getCallSignFinderToolSpec(Cluster cluster, String bucketName) {
+    return Tools.ToolSpecification.builder()
+            .functionName("airline-lookup")
+            .functionDescription("You are a tool who finds only the airline name and do not worry about any other parameters. You simply find the airline name and ignore the rest of the parameters. Do not validate airline names as I want to use fake/fictitious airline names as well.")
+            .toolFunction(new AirlineCallsignQueryToolFunction(bucketName, cluster))
+            .toolPrompt(
+                    Tools.PromptFuncDefinition.builder()
+                            .type("prompt")
+                            .function(
+                                    Tools.PromptFuncDefinition.PromptFuncSpec.builder()
+                                            .name("get-airline-name")
+                                            .description("Get the airline name")
+                                            .parameters(
+                                                    Tools.PromptFuncDefinition.Parameters.builder()
+                                                            .type("object")
+                                                            .properties(
+                                                                    Map.of(
+                                                                            "airlineName", Tools.PromptFuncDefinition.Property.builder()
+                                                                                    .type("string")
+                                                                                    .description("The name of the airline. e.g. Emirates")
+                                                                                    .required(true)
+                                                                                    .build()
+                                                                    )
+                                                            )
+                                                            .required(java.util.List.of("airlineName"))
+                                                            .build()
+                                            )
+                                            .build()
+                            )
+                            .build()
+            )
+            .build();
+  }
+
+  public static Tools.ToolSpecification getCallSignUpdaterToolSpec(Cluster cluster, String bucketName) {
+    return Tools.ToolSpecification.builder()
+            .functionName("airline-update")
+            .functionDescription("You are a tool who finds the airline name and its callsign and do not worry about any validations. You simply find the airline name and its callsign. Do not validate airline names as I want to use fake/fictitious airline names as well.")
+            .toolFunction(new AirlineCallsignUpdateToolFunction(bucketName, cluster))
+            .toolPrompt(
+                    Tools.PromptFuncDefinition.builder()
+                            .type("prompt")
+                            .function(
+                                    Tools.PromptFuncDefinition.PromptFuncSpec.builder()
+                                            .name("get-airline-name-and-callsign")
+                                            .description("Get the airline name and callsign")
+                                            .parameters(
+                                                    Tools.PromptFuncDefinition.Parameters.builder()
+                                                            .type("object")
+                                                            .properties(
+                                                                    Map.of(
+                                                                            "airlineName", Tools.PromptFuncDefinition.Property.builder()
+                                                                                    .type("string")
+                                                                                    .description("The name of the airline. e.g. Emirates")
+                                                                                    .required(true)
+                                                                                    .build(),
+                                                                            "airlineCallsign", Tools.PromptFuncDefinition.Property.builder()
+                                                                                    .type("string")
+                                                                                    .description("The callsign of the airline. e.g. Maverick")
+                                                                                    // Callsigns are free-form text, so no enum restriction is set here.
+                                                                                    .required(true)
+                                                                                    .build()
+                                                                    )
+                                                            )
+                                                            .required(java.util.List.of("airlineName", "airlineCallsign"))
+                                                            .build()
+                                            )
+                                            .build()
+                            )
+                            .build()
+            )
+            .build();
+  }
+}
+
+class AirlineCallsignQueryToolFunction implements ToolFunction {
+  private final String bucketName;
+  private final Cluster cluster;
+
+  public AirlineCallsignQueryToolFunction(String bucketName, Cluster cluster) {
+    this.bucketName = bucketName;
+    this.cluster = cluster;
+  }
+
+  @Override
+  public AirlineDetail apply(Map<String, Object> arguments) {
+    String airlineName = arguments.get("airlineName").toString();
+
+    Bucket bucket = cluster.bucket(bucketName);
+    bucket.waitUntilReady(Duration.ofSeconds(10));
+
+    Scope inventoryScope = bucket.scope("inventory");
+    QueryResult result = inventoryScope.query(String.format("SELECT * FROM airline WHERE name = '%s';", airlineName));
+
+    JsonObject row = (JsonObject) result.rowsAsObject().get(0).get("airline");
+    return new AirlineDetail(row.getString("callsign"), row.getString("name"), row.getString("country"));
+  }
+}
+
+class AirlineCallsignUpdateToolFunction implements ToolFunction {
+  private final String bucketName;
+  private final Cluster cluster;
+
+  public AirlineCallsignUpdateToolFunction(String bucketName, Cluster cluster) {
+    this.bucketName = bucketName;
+    this.cluster = cluster;
+  }
+
+
+  @Override
+  public Boolean apply(Map<String, Object> arguments) {
+    String airlineName = arguments.get("airlineName").toString();
+    String airlineNewCallsign = arguments.get("airlineCallsign").toString();
+
+    Bucket bucket = cluster.bucket(bucketName);
+    bucket.waitUntilReady(Duration.ofSeconds(10));
+
+    Scope inventoryScope = bucket.scope("inventory");
+    String query = String.format("SELECT * FROM airline WHERE name = '%s';", airlineName);
+
+    QueryResult result;
+    try {
+      result = inventoryScope.query(query);
+    } catch (Exception e) {
+      throw new RuntimeException("Error executing query", e);
+    }
+
+    if (result.rowsAsObject().isEmpty()) {
+      throw new RuntimeException("Airline not found with name: " + airlineName);
+    }
+
+    JsonObject row = (JsonObject) result.rowsAsObject().get(0).get("airline");
+
+    if (row == null) {
+      throw new RuntimeException("Airline data is missing or corrupted.");
+    }
+
+    String currentCallsign = row.getString("callsign");
+
+    if (!airlineNewCallsign.equals(currentCallsign)) {
+      // NOTE: String.format keeps this example short. In production code, prefer a
+      // parameterized query so that model-supplied values cannot alter the statement.
+
+      inventoryScope.query(String.format(
+              "UPDATE airline SET callsign = '%s' WHERE name = '%s';",
+              airlineNewCallsign, airlineName
+      ));
+      return true;
+    }
+    return false;
+  }
+}
+
+@SuppressWarnings("ALL")
+@Data
+@AllArgsConstructor
+@NoArgsConstructor
+class AirlineDetail {
+  private String callsign;
+  private String name;
+  private String country;
+}
+```
+
+### Key Concepts
+
+#### 1. Ollama API Client Setup
+
+```java
+Ollama ollama = new Ollama(host);
+
+ollama.setRequestTimeoutSeconds(60);
+```
+
+Here, we initialize the Ollama client with the host of the Ollama server that serves the model and handles API
+requests. We also set the request timeout to 60 seconds so that requests are not cut off when the model takes a
+while to respond.
+
+#### 2. Tool Specification
+
+The ToolSpecification class defines how the model will interact with the Couchbase database. We define a function that
+queries the database for airline details based on the airline name.
+
+```java
+Tools.ToolSpecification callSignFinderToolSpec = getCallSignFinderToolSpec(cluster, bucketName);
+
+ollama.registerTool(callSignFinderToolSpec);
+```
+
+This step registers the custom tool with Ollama, which allows the tool-calling model to invoke database queries.
+
+#### 3. Query Execution
+
+The tool will execute a Couchbase N1QL query to retrieve the airline details:
+
+```java
+QueryResult result = inventoryScope.query(String.format("SELECT * FROM airline WHERE name = '%s';", airlineName));
+```
+
+The result is processed and returned as an AirlineDetail object.
+
+#### 4. Set up your prompt (question)
+
+```java
+String prompt = "What is the call-sign of Astraeus?";
+```
+
+#### 5. Generating Results with Tools
+
+```java
+for (OllamaToolsResult.ToolResult r : ollama.generateWithTools(modelName, new Tools.PromptBuilder()
+        .withToolSpecification(callSignFinderToolSpec)
+        .withPrompt(prompt)
+        .build(), new OptionsBuilder().build()).getToolResults()) {
+    AirlineDetail airlineDetail = (AirlineDetail) r.getResult();
+    System.out.printf("[Result of tool '%s']: Call-sign of %s is '%s'! ✈️", r.getFunctionName(), airlineDetail.getName(), airlineDetail.getCallsign());
+}
+```
+
+This invokes the tool-calling model (Mistral in this case) with the provided prompt and uses the registered tool to
+query the database. The result is returned and printed to the console.
+
+So, we ask the model the following question.
+
+> **What is the call-sign of Astraeus?**
+
+And, here’s what the model responds:
+
+> **Call-sign of Astraeus is ‘FLYSTAR’! ✈️**
+
+Isn’t that amazing? Now, let’s enhance it further by adding a function that allows us to update an airline’s call sign
+using natural language.
+
+Let’s define another `ToolSpecification` that describes how the model will interact with the Couchbase database to
+update it. We define a function that queries the database for airline details based on the airline name and
+then updates the airline’s callsign.
+
+```java
+Tools.ToolSpecification callSignUpdaterToolSpec = getCallSignUpdaterToolSpec(cluster, bucketName);
+
+ollama.registerTool(callSignUpdaterToolSpec);
+```
+
+The tool will execute a Couchbase N1QL query to update the airline’s callsign.
+
+```java
+inventoryScope.query(String.format(
+        "UPDATE airline SET callsign = '%s' WHERE name = '%s';",
+        airlineNewCallsign, airlineName
+));
+```
+
+Set up the prompt to instruct the model to update the airline’s callsign.
+
+```java
+String prompt = "I want to code name Astraeus as STARBOUND";
+```
+
+And then we invoke the model with the new prompt.
+
+```java
+String prompt = "I want to code name Astraeus as STARBOUND";
+for (OllamaToolsResult.ToolResult r : ollama.generateWithTools(modelName, new Tools.PromptBuilder()
+        .withToolSpecification(callSignUpdaterToolSpec)
+        .withPrompt(prompt)
+        .build(), new OptionsBuilder().build()).getToolResults()) {
+    Boolean updated = (Boolean) r.getResult();
+    System.out.println(String.format("[Result of tool '%s']: Call-sign is %s! ✈️", r.getFunctionName(), updated ? "updated" : "not updated"));
+}
+```
+
+This invokes the tool-calling model (Mistral in this case) with the new prompt and uses the registered tool to update
+the database.
+
+So, we give the model the following instruction.
+
+> **I want to code name Astraeus as STARBOUND.**
+
+And, here’s what the model responds:
+
+> **Call-sign is updated! ✈️**
+
+How amazing is that? The possibilities for interacting with your data using natural language are endless. You could
+integrate features like checking flight availability, booking tickets, retrieving ticket details, and so much more!
+
+Feel free to extend this example further by adding more sophisticated capabilities! 🚀
+
+### Conclusion
+
+With the code above, you can use models served by Ollama (like Mistral) to query a Couchbase database using natural
+language prompts. This makes it possible to interact with databases in a more intuitive and human-like way.
+
+By leveraging Ollama4j, you can connect AI models to real-world applications and build powerful tools that can automate
+complex tasks or simply make querying your data more conversational.
+
+You can find the full code and more such examples in
+the [ollama4j-examples](https://github.com/ollama4j/ollama4j-examples) GitHub repository.
+
+Credit to Couchbase, Ollama, and all the model teams for providing us with such amazing software!
--- a/docs/docs/agent.md
+++ b/docs/docs/agent.md
@@ -0,0 +1,60 @@
+---
+sidebar_position: 4
+
+title: Agents
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+
+# Agents
+
+Build powerful, flexible agents—backed by LLMs and tools—in a few minutes.
+
+Ollama4j’s agent system lets you bring together the best of LLM reasoning and external tool-use using a simple, declarative YAML configuration. No framework bloat, no complicated setup—just describe your agent, plug in your logic, and go.
+
+---
+
+**Why use agents in Ollama4j?**
+
+- **Effortless Customization:** Instantly adjust your agent’s persona, reasoning strategies, or domain by tweaking YAML. No need to touch your compiled Java code.
+- **Easy Extensibility:** Want new capabilities? Just add or change tools and logic classes—no framework glue or plumbing required.
+- **Fast Experimentation:** Mix-and-match models, instructions, and tools—prototype sophisticated behaviors or orchestrators in minutes.
+- **Clean Separation:** Keep business logic (Java) and agent personality/configuration (YAML) separate for maintainability and clarity.
+
+---
+
+## Define an Agent in YAML
+
+Specify everything about your agent—what LLM it uses, its “personality,” and all callable tools—in a single YAML file.
+
+**Agent YAML keys:**
+
+| Field                   | Description                                                                                                           |
+|-------------------------|-----------------------------------------------------------------------------------------------------------------------|
+| `name`                  | Name of your agent.                                                                                                   |
+| `host`                  | The base URL for your Ollama server (e.g., `http://localhost:11434`).                                                |
+| `model`                 | The LLM backing your agent (e.g., `llama2`, `mistral`, `mixtral`, etc).                                              |
+| `customPrompt`          | _(optional)_ System prompt—instructions or persona for your agent.                                                   |
+| `tools`                 | List of tools the agent can use. Each tool entry describes the name, function, and parameters.                        |
+| `toolFunctionFQCN`      | Fully qualified Java class name implementing the tool logic. Must be present on classpath.                            |
+| `requestTimeoutSeconds` | _(optional)_ How long (seconds) to wait for agent replies.                                                            |
+
+YAML makes it effortless to configure and tweak your agent’s powers and behavior—no code changes needed!
+
+**Example agent YAML:**
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/resources/agent.yaml" language='yaml'/>
+
+---
+
+## Instantiating and Running Agents in Java
+
+Once your agent is described in YAML, bringing it to life in Java takes only a couple of lines:
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/AgentExample.java"/>
+
+- **No boilerplate.** Just load and start chatting or calling tools.
+- The API takes care of wiring up LLMs, tool invocation, and instruction handling.
+
+Ready to build your own AI-powered assistant? Just write your YAML, implement the tool logic in Java, and go!
+
--- a/docs/docs/apis-extras/_category_.json
+++ b/docs/docs/apis-extras/_category_.json
@@ -1,8 +1,8 @@
 {
-  "label": "APIs - Extras",
-  "position": 4,
-  "link": {
-    "type": "generated-index",
-    "description": "Details of APIs to handle bunch of extra stuff."
-  }
+    "label": "Extras",
+    "position": 5,
+    "link": {
+        "type": "generated-index",
+        "description": "Details of APIs to handle bunch of extra stuff."
+    }
 }
--- a/docs/docs/apis-extras/basic-auth.md
+++ b/docs/docs/apis-extras/basic-auth.md
@@ -1,8 +1,8 @@
 ---
-sidebar_position: 2
+sidebar_position: 3
 ---

-# Set Basic Authentication
+# Basic Auth

 This API lets you set the basic authentication for the Ollama client. This would help in scenarios where
 Ollama server would be setup behind a gateway/reverse proxy with basic auth.
@@ -10,7 +10,7 @@ Ollama server would be setup behind a gateway/reverse proxy with basic auth.
 After configuring basic authentication, all subsequent requests will include the Basic Auth header.

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;

 public class Main {

@@ -18,9 +18,9 @@ public class Main {

        String host = "http://localhost:11434/";

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
+        Ollama ollama = new Ollama(host);

-        ollamaAPI.setBasicAuth("username", "password");
+        ollama.setBasicAuth("username", "password");
    }
 }
 ```
--- a/docs/docs/apis-extras/bearer-auth.md
+++ b/docs/docs/apis-extras/bearer-auth.md
@@ -0,0 +1,26 @@
+---
+sidebar_position: 4
+---
+
+# Bearer Auth
+
+This API lets you set bearer authentication for the Ollama client. This helps in scenarios where the
+Ollama server is set up behind a gateway/reverse proxy with bearer auth.
+
+After configuring bearer authentication, all subsequent requests will include the Bearer Auth header.
+
+```java
+import io.github.ollama4j.Ollama;
+
+public class Main {
+
+    public static void main(String[] args) {
+
+        String host = "http://localhost:11434/";
+
+        Ollama ollama = new Ollama(host);
+
+        ollama.setBearerAuth("YOUR-TOKEN");
+    }
+}
+```
--- a/docs/docs/apis-extras/logging.md
+++ b/docs/docs/apis-extras/logging.md
@@ -0,0 +1,26 @@
+---
+sidebar_position: 7
+---
+
+# Logging
+
+### Using with SLF4J and Logback
+
+Add a `logback.xml` file to your `src/main/resources` folder with the following content:
+
+```xml
+
+<configuration>
+    <appender name="STDOUT" class="ch.qos.logback.core.ConsoleAppender">
+        <encoder>
+            <pattern>%d{yyyy-MM-dd HH:mm:ss} %-5level %logger{36} - %msg%n</pattern>
+        </encoder>
+    </appender>
+
+    <root level="DEBUG">
+        <appender-ref ref="STDOUT"/>
+    </root>
+
+</configuration>
+
+```
--- a/docs/docs/apis-extras/options-builder.md
+++ b/docs/docs/apis-extras/options-builder.md
@@ -36,7 +36,7 @@ from [javadoc](https://ollama4j.github.io/ollama4j/apidocs/io/github/ollama4j/ol
 ## Build an empty `Options` object

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;
 import io.github.ollama4j.utils.Options;
 import io.github.ollama4j.utils.OptionsBuilder;

@@ -46,7 +46,7 @@ public class Main {

        String host = "http://localhost:11434/";

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
+        Ollama ollama = new Ollama(host);

        Options options = new OptionsBuilder().build();
    }
@@ -65,7 +65,7 @@ public class Main {

        String host = "http://localhost:11434/";

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
+        Ollama ollama = new Ollama(host);

        Options options =
                new OptionsBuilder()
@@ -74,6 +74,15 @@ public class Main {
                        .setNumGpu(2)
                        .setTemperature(1.5f)
                        .build();
+
+        OllamaResult result =
+                ollama.generate(
+                        OllamaGenerateRequestBuilder.builder()
+                                .withModel(model)
+                                .withPrompt("Who are you?")
+                                .withOptions(options)
+                                .build(),
+                        null);
    }
 }
 ```
--- a/docs/docs/apis-extras/ping.md
+++ b/docs/docs/apis-extras/ping.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 3
+sidebar_position: 5
 ---

 # Ping
@@ -7,16 +7,16 @@ sidebar_position: 3
 This API lets you check the reachability of Ollama server.

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;

 public class Main {

    public static void main(String[] args) {
        String host = "http://localhost:11434/";
-        
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        
-        ollamaAPI.ping();
+
+        Ollama ollama = new Ollama(host);
+
+        ollama.ping();
    }
 }
 ```
--- a/docs/docs/apis-generate/prompt-builder.md
+++ b/docs/docs/apis-generate/prompt-builder.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 6
+sidebar_position: 2
 ---

 # Prompt Builder
@@ -8,7 +8,7 @@ This is designed for prompt engineering. It allows you to easily build the promp
 inferences.

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;
 import io.github.ollama4j.models.response.OllamaResult;
 import io.github.ollama4j.types.OllamaModelType;
 import io.github.ollama4j.utils.OptionsBuilder;
@@ -18,8 +18,8 @@ public class Main {
    public static void main(String[] args) throws Exception {

        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        ollamaAPI.setRequestTimeoutSeconds(10);
+        Ollama ollama = new Ollama(host);
+        ollama.setRequestTimeoutSeconds(10);

        String model = OllamaModelType.PHI;

@@ -43,7 +43,7 @@ public class Main {
                        .add("How do I read a file in Go and print its contents to stdout?");

        boolean raw = false;
-        OllamaResult response = ollamaAPI.generate(model, promptBuilder.build(), raw, new OptionsBuilder().build());
+        OllamaResult response = ollama.generate(model, promptBuilder.build(), raw, new OptionsBuilder().build());
        System.out.println(response.getResponse());
    }
 }
@@ -51,6 +51,7 @@ public class Main {

 You will get a response similar to:

+:::tip[LLM Response]
 ```go
 package main

@@ -71,4 +72,5 @@ func readFile(fileName string) {
        fmt.Println(f.String())
    }
 }
-```
+```
+:::
--- a/docs/docs/apis-extras/ps.md
+++ b/docs/docs/apis-extras/ps.md
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 4
+sidebar_position: 5
 ---

 # PS
@@ -11,18 +11,15 @@ This API corresponds to the [PS](https://github.com/ollama/ollama/blob/main/docs
 ```java
 package io.github.ollama4j.localtests;

-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.exceptions.OllamaBaseException;
-import io.github.ollama4j.models.ps.ModelsProcessResponse;
-
-import java.io.IOException;
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.models.ps.ModelProcessesResult;

 public class Main {
    public static void main(String[] args) {

-        OllamaAPI ollamaAPI = new OllamaAPI("http://localhost:11434");
+        Ollama ollama = new Ollama("http://localhost:11434");

-        ModelsProcessResponse response = ollamaAPI.ps();
+        ModelProcessesResult response = ollama.ps();

        System.out.println(response);
    }
--- a/docs/docs/apis-extras/request-timeout.md
+++ b/docs/docs/apis-extras/request-timeout.md
@@ -1,23 +0,0 @@
---
-sidebar_position: 2
---
-
-# Set Request Timeout
-
-This API lets you set the request timeout for the Ollama client.
-
-```java
-import io.github.ollama4j.OllamaAPI;
-
-public class Main {
-
-  public static void main(String[] args) {
-
-    String host = "http://localhost:11434/";
-
-    OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-    ollamaAPI.setRequestTimeoutSeconds(10);
-  }
-}
-```
--- a/docs/docs/apis-extras/timeouts.md
+++ b/docs/docs/apis-extras/timeouts.md
@@ -0,0 +1,25 @@
+---
+sidebar_position: 2
+---
+
+# Timeouts
+
+### Set Request Timeout
+
+This API lets you set the request timeout for the Ollama client.
+
+```java
+import io.github.ollama4j.Ollama;
+
+public class Main {
+
+    public static void main(String[] args) {
+
+        String host = "http://localhost:11434/";
+
+        Ollama ollama = new Ollama(host);
+
+        ollama.setRequestTimeoutSeconds(10);
+    }
+}
+```
--- a/docs/docs/apis-extras/verbosity.md
+++ b/docs/docs/apis-extras/verbosity.md
@@ -1,25 +0,0 @@
---
-sidebar_position: 1
---
-
-# Set Verbosity
-
-This API lets you set the verbosity of the Ollama client.
-
-## Try asking a question about the model.
-
-```java
-import io.github.ollama4j.OllamaAPI;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        ollamaAPI.setVerbose(true);
-    }
-}
-```
--- a/docs/docs/apis-generate/_category_.json
+++ b/docs/docs/apis-generate/_category_.json
@@ -1,8 +1,8 @@
 {
-  "label": "APIs - Generate",
-  "position": 3,
-  "link": {
-    "type": "generated-index",
-    "description": "Details of APIs to interact with LLMs."
-  }
+    "label": "Generate",
+    "position": 3,
+    "link": {
+        "type": "generated-index",
+        "description": "Details of APIs to interact with LLMs."
+    }
 }
--- a/docs/docs/apis-generate/chat-with-thinking.md
+++ b/docs/docs/apis-generate/chat-with-thinking.md
@@ -0,0 +1,92 @@
+---
+sidebar_position: 8
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
+# Chat with Thinking
+
+This API lets you generate responses from an LLM while also retrieving the model's "thinking" process separately from
+the final answer. The "thinking" tokens represent the model's internal reasoning or planning before it produces the
+actual response. This can be useful for debugging, transparency, or simply understanding how the model arrives at its
+answers.
+
+You can use this feature to receive both the thinking and the response as separate outputs, either as a complete result
+or streamed token by token. The examples below show how to use the API to access both the thinking and the response, and
+how to display them in your application.
+
+### Chat with thinking model and receive the thinking and response text separately
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithThinkingModelExample.java" />
+
+You will get a response similar to:
+
+:::tip[LLM Response]
+**First thinking response:** User asks a simple question. We just answer.
+
+**First answer response:** The capital of France is _**Paris**_.
+
+**Second thinking response:** User: "And what is the second largest city?" They asked about the second largest city in
+France. Provide answer: Paris largest, second largest is Marseille. We can provide population stats, maybe mention Lyon
+as third largest. Also context. The answer should be concise. Provide some details: Marseille is the second largest,
+population ~870k, located on Mediterranean coast. Provide maybe some facts. Given no request for extra context, just answer.
+
+**Second answer response:** The second‑largest city in France is _**Marseille**_. It’s a major Mediterranean port with a
+population of roughly 870,000 (as of the latest estimates) and is known for its historic Old Port, vibrant cultural
+scene, and diverse population.
+:::
+
+### Chat with thinking model and receive the thinking and response tokens streamed
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingWithThinkingExample.java" />
+
+You will get a response similar to:
+
+:::tip[First Question's Thinking Tokens]
+<TypewriterTextarea
+textContent={`USER ASKS A SIMPLE QUESTION: "WHAT IS THE CAPITAL OF FRANCE?" THE ANSWER: PARIS. PROVIDE ANSWER.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
+
+:::tip[First Question's Response Tokens]
+<TypewriterTextarea
+textContent={`the capital of france is 'paris'.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
+
+:::tip[Second Question's Thinking Tokens]
+<TypewriterTextarea
+textContent={`THE USER ASKS: "AND WHAT IS THE SECOND LARGEST CITY?" LIKELY REFERRING TO FRANCE. THE SECOND LARGEST CITY IN FRANCE (BY POPULATION) IS MARSEILLE. HOWEVER, THERE MIGHT BE NUANCE: THE LARGEST IS PARIS, SECOND LARGEST IS MARSEILLE. BUT SOME MIGHT ARGUE THAT LYON IS SECOND LARGEST? LET'S CONFIRM: POPULATION OF FRANCE: PARIS ~2.1M (METRO 12M). MARSEILLE ~870K (METRO 1.5M). LYON ~515K (METRO 1.5M). SO MARSEILLE IS SECOND LARGEST CITY PROPER. LYON IS THIRD LARGEST. SO ANSWER: MARSEILLE. WE SHOULD PROVIDE THAT. PROVIDE A BRIEF EXPLANATION.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
+
+:::tip[Second Question's Response Tokens]
+<TypewriterTextarea
+textContent={`the second‑largest city in france by population is 'marseille'.
+- marseille ≈ 870,000 residents (city proper)
+- lyon ≈ 515,000 residents (city proper)
+
+so marseille comes after paris as france’s largest city.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
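A minimal sketch of the idea behind the streamed example in `chat-with-thinking.md`, assuming a stream in which every chunk is tagged as either thinking text or response text. `ThinkingStreamSketch`, `Chunk`, `dispatch`, and `collect` are hypothetical names used for illustration only; they are not ollama4j classes, and the actual handlers are the ones in the embedded examples.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: none of these types belong to ollama4j.
class ThinkingStreamSketch {

    // One streamed chunk: either a thinking token or a response token.
    static final class Chunk {
        final boolean thinking;
        final String token;

        Chunk(boolean thinking, String token) {
            this.thinking = thinking;
            this.token = token;
        }
    }

    // Routes each chunk to the handler that matches its kind.
    static void dispatch(List<Chunk> chunks,
                         Consumer<String> onThinking,
                         Consumer<String> onResponse) {
        for (Chunk chunk : chunks) {
            if (chunk.thinking) {
                onThinking.accept(chunk.token);
            } else {
                onResponse.accept(chunk.token);
            }
        }
    }

    // Collects both streams and returns them as {thinking, response}.
    static String[] collect(List<Chunk> chunks) {
        StringBuilder thinking = new StringBuilder();
        StringBuilder response = new StringBuilder();
        dispatch(chunks, thinking::append, response::append);
        return new String[] {thinking.toString(), response.toString()};
    }

    public static void main(String[] args) {
        // Simulated stream; a real client would receive these chunks over time.
        List<Chunk> simulated = List.of(
                new Chunk(true, "User asks a simple question. "),
                new Chunk(true, "We just answer."),
                new Chunk(false, "The capital of France "),
                new Chunk(false, "is Paris."));
        String[] result = collect(simulated);
        System.out.println("Thinking: " + result[0]);
        System.out.println("Response: " + result[1]);
    }
}
```

Keeping the two handlers separate is what lets an application render the reasoning and the final answer in different places, as the typewriter panels in the documentation do.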
--- a/docs/docs/apis-generate/chat-with-tools.md
+++ b/docs/docs/apis-generate/chat-with-tools.md
@@ -0,0 +1,81 @@
+---
+sidebar_position: 8
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+
+# Chat with Tools
+
+### Using Tools in Chat
+
+If you want to have a natural back-and-forth chat experience with tools, you can directly integrate tools into
+the `chat()` method, instead of using the `generateWithTools()` method. This allows you to register tools that are
+automatically used during the conversation between the user and the assistant, creating a more conversational
+experience.
+
+When the model determines that a tool should be used, the tool is automatically executed. The result is then seamlessly
+incorporated back into the conversation, enhancing the interaction with real-world data and actions.
+
+The following example demonstrates a simple tool that is registered with the `Ollama` client and then used within a chat
+session. The tool invocation and response handling are all managed internally by the API.
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithTools.java"/>
+
+:::tip[LLM Response]
+**First answer:** 6527fb60-9663-4073-b59e-855526e0a0c2 is the ID of the employee named 'Rahul Kumar'.
+
+**Second answer:**  _Kumar_ is the last name of the employee named 'Rahul Kumar'.
+:::
+
+Tool calling also works with the streaming API.
+
+### Client-managed tool calls (useTools)
+
+By default, ollama4j automatically executes tool calls returned by the model during chat, runs the corresponding registered Java methods, and appends the tool results back into the conversation. For some applications, you may want to intercept tool calls and decide yourself when and how to execute them (for example, to queue them, to show a confirmation UI to the user, to run them in a sandbox, or to perform multi‑step orchestration).
+
+To enable this behavior, set the `useTools` flag to `true` on your `Ollama` instance. When enabled, ollama4j stops auto‑executing tools and instead returns tool calls inside the assistant message. You can then inspect the tool calls and execute them manually.
+
+
+Notes:
+- Default value: `useTools` is `false`.
+- When `useTools` is `false`, ollama4j auto‑executes tools and loops internally until tools are resolved or max retries is reached.
+- When `useTools` is `true`, ollama4j will not execute tools; you are responsible for invoking tools and passing results back as `TOOL` messages, then re‑calling `chat()` to continue.
+
+### Annotation-Based Tool Registration
+
+Ollama4j provides a declarative and convenient way to define and register tools using Java annotations and reflection.
+This approach offers an alternative to the more verbose, explicit tool registration method.
+
+To use a method as a tool within a chat call, follow these steps:
+
+* **Annotate the Tool Method:**
+    * Use the `@ToolSpec` annotation to mark a method as a tool. This annotation describes the tool's purpose.
+    * Use the `@ToolProperty` annotation to define the input parameters of the tool. The following data types are
+      currently supported:
+        * `java.lang.String`
+        * `java.lang.Integer`
+        * `java.lang.Boolean`
+        * `java.math.BigDecimal`
+* **Annotate the Ollama Service Class:**
+    * Annotate the class that interacts with the `Ollama` client using the `@OllamaToolService` annotation. Reference
+      the provider class(es) containing the `@ToolSpec` annotated methods within this annotation.
+* **Register the Annotated Tools:**
+    * Before making a chat request with the `Ollama` client, call the `Ollama.registerAnnotatedTools()` method. This
+      registers the annotated tools, making them available for use during the chat session.
+
+Let's try an example. Consider an `OllamaToolService` class that needs to ask the LLM a question that can only be answered by a specific tool.
+This tool is implemented within a `GlobalConstantGenerator` class. Following is the code that exposes an annotated method as a tool:
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/annotated/GlobalConstantGenerator.java"/>
+
+The annotated method can then be used as a tool in the chat session:
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/AnnotatedToolCallingExample.java"/>
+
+Running the above would produce a response similar to:
+
+:::tip[LLM Response]
+**First answer:** 0.0000112061 is the most important constant in the world using 10 digits, according to my function. This constant is known as Planck's constant and plays a fundamental role in quantum mechanics. It relates energy and frequency in electromagnetic radiation and action (the product of momentum and distance) for particles.
+
+**Second answer:** 3-digit constant: 8.001
+:::
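A minimal sketch of the client-managed flow described in `chat-with-tools.md`, assuming the model returns tool calls as a name plus an argument map and that results go back into the conversation as messages with the `tool` role. `ClientManagedToolsSketch`, `ToolCall`, `Message`, `resolve`, and the `lookup-employee-id` tool are hypothetical names for illustration; they are not ollama4j types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: these types only mimic the flow and are not ollama4j classes.
class ClientManagedToolsSketch {

    // A tool call as the model might return it: a tool name plus arguments.
    static final class ToolCall {
        final String name;
        final Map<String, Object> arguments;

        ToolCall(String name, Map<String, Object> arguments) {
            this.name = name;
            this.arguments = arguments;
        }
    }

    // A chat message with a role and content.
    static final class Message {
        final String role;
        final String content;

        Message(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    // Runs each requested tool and returns the outputs as "tool" messages
    // that the caller would append to the history before calling chat again.
    static List<Message> resolve(Map<String, Function<Map<String, Object>, String>> tools,
                                 List<ToolCall> calls) {
        List<Message> results = new ArrayList<>();
        for (ToolCall call : calls) {
            Function<Map<String, Object>, String> tool = tools.get(call.name);
            String output = (tool == null)
                    ? "Unknown tool: " + call.name
                    : tool.apply(call.arguments);
            results.add(new Message("tool", output));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Function<Map<String, Object>, String>> tools = new HashMap<>();
        tools.put("lookup-employee-id", arguments -> "id-for-" + arguments.get("name"));

        List<ToolCall> calls =
                List.of(new ToolCall("lookup-employee-id", Map.of("name", "Rahul Kumar")));

        for (Message message : resolve(tools, calls)) {
            System.out.println(message.role + ": " + message.content);
        }
    }
}
```

Because the client owns the `resolve` step, this is the place where an application could queue calls, ask the user for confirmation, or run the tool in a sandbox before the result is sent back.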
--- a/docs/docs/apis-generate/chat.md
+++ b/docs/docs/apis-generate/chat.md
@@ -2,271 +2,103 @@
 sidebar_position: 7
 ---

+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
 # Chat

 This API lets you create a conversation with LLMs. Using this API enables you to ask questions to the model including
 information using the history of already asked questions and the respective answers.

-## Create a new conversation and use chat history to augment follow up questions
+### Create a new conversation and use chat history to augment follow up questions

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.chat.OllamaChatMessageRole;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.chat.OllamaChatResult;
-import io.github.ollama4j.types.OllamaModelType;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(OllamaModelType.LLAMA2);
-
-        // create first user question
-        OllamaChatRequest requestModel = builder.withMessage(OllamaChatMessageRole.USER, "What is the capital of France?")
-                .build();
-
-        // start conversation with model
-        OllamaChatResult chatResult = ollamaAPI.chat(requestModel);
-
-        System.out.println("First answer: " + chatResult.getResponse());
-
-        // create next userQuestion
-        requestModel = builder.withMessages(chatResult.getChatHistory()).withMessage(OllamaChatMessageRole.USER, "And what is the second largest city?").build();
-
-        // "continue" conversation with model
-        chatResult = ollamaAPI.chat(requestModel);
-
-        System.out.println("Second answer: " + chatResult.getResponse());
-
-        System.out.println("Chat History: " + chatResult.getChatHistory());
-    }
-}
-
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatExample.java" />

 You will get a response similar to:

-> First answer: Should be Paris!
+:::tip[LLM Response]
+
+> First answer: The capital of France is Paris.
 >
-> Second answer: Marseille.
+> Second answer: The second-largest city in France is Marseille.
 >
 > Chat History:

 ```json
-[
-  {
-    "role": "user",
-    "content": "What is the capital of France?",
-    "images": []
-  },
-  {
-    "role": "assistant",
-    "content": "Should be Paris!",
-    "images": []
-  },
-  {
-    "role": "user",
-    "content": "And what is the second largest city?",
-    "images": []
-  },
-  {
-    "role": "assistant",
-    "content": "Marseille.",
-    "images": []
-  }
-]
+[{
+  "role" : "user",
+  "content" : "What is the capital of France?",
+  "images" : null,
+  "tool_calls" : [ ]
+}, {
+  "role" : "assistant",
+  "content" : "The capital of France is Paris.",
+  "images" : null,
+  "tool_calls" : null
+}, {
+  "role" : "user",
+  "content" : "And what is the second largest city?",
+  "images" : null,
+  "tool_calls" : [ ]
+}, {
+  "role" : "assistant",
+  "content" : "The second-largest city in France is Marseille.",
+  "images" : null,
+  "tool_calls" : null
+}]
 ```
+:::

-## Conversational loop
+### Create a conversation where the answer is streamed

-```java
-public class Main {
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingExample.java" />

-    public static void main(String[] args) {
+<TypewriterTextarea
+    textContent="'The Great Gatsby' by F. Scott Fitzgerald is a complex and multifaceted novel that explores themes of wealth, class, love, loss, and the American Dream. It is a landmark work of American literature that examines the social and psychological consequences of the American Dream's unattainability and its impact on the lives of its characters."
+    typingSpeed={5}
+    pauseBetweenSentences={1200}
+    height='140px'
+    width='100%'
+/>

-        OllamaAPI ollamaAPI = new OllamaAPI();
-        ollamaAPI.setRequestTimeoutSeconds(60);
+### Using a simple Console Output Stream Handler

-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance("<your-model>");
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithConsoleHandlerExample.java" />

-        OllamaChatRequest requestModel = builder.withMessage(OllamaChatMessageRole.USER, "<your-first-message>").build();
-        OllamaChatResult initialChatResult = ollamaAPI.chat(requestModel);
-        System.out.println(initialChatResult.getResponse());
+### With a Stream Handler to receive the tokens as they are generated

-        List<OllamaChatMessage> history = initialChatResult.getChatHistory();
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatStreamingExample.java" />

-        while (true) {
-            OllamaChatResult chatResult = ollamaAPI.chat(builder.withMessages(history).withMessage(OllamaChatMessageRole.USER, "<your-new-message").build());
-            System.out.println(chatResult.getResponse());
-            history = chatResult.getChatHistory();
-        }
-    }
-}
-```
+### Create a new conversation with custom system prompt

-## Create a conversation where the answer is streamed
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithCustomSystemPrompt.java" />

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.chat.OllamaChatMessageRole;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatResult;
-import io.github.ollama4j.models.generate.OllamaStreamHandler;
+You will get a response similar to:
+
+:::tip[LLM Response]
+Shhh!
+:::


-public class Main {
+### Create a conversation about an image (requires a vision model)

-    public static void main(String[] args) {
+Let's use this image:

-        String host = "http://localhost:11434/";
+<img src="https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg" alt="Img" style={{ maxWidth: '250px', height: 'auto', display: 'block', margin: '1rem 0' }} />

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(config.getModel());
-        OllamaChatRequest requestModel = builder.withMessage(OllamaChatMessageRole.USER,
-                        "What is the capital of France? And what's France's connection with Mona Lisa?")
-                .build();
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ChatWithImage.java" />

-        // define a handler (Consumer<String>)
-        OllamaStreamHandler streamHandler = (s) -> {
-            System.out.println(s);
-        };
-
-        OllamaChatResult chatResult = ollamaAPI.chat(requestModel, streamHandler);
-    }
-}
-```

 You will get a response similar to:

-> The
-> The capital
-> The capital of
-> The capital of France
-> The capital of France is
-> The capital of France is Paris
-> The capital of France is Paris.
+:::tip[LLM Response]
+**First Answer:** The image shows a dog sitting on the bow of a boat that is docked in calm water. The boat has two
+levels, with the lower level containing seating and what appears to be an engine cover. The dog seems relaxed and
+comfortable on the boat, looking out over the water. The background suggests it might be late afternoon or early
+evening, given the warm lighting and the low position of the sun in the sky.

-## Use a simple Console Output Stream Handler
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.impl.ConsoleOutputStreamHandler;
-import io.github.ollama4j.models.chat.OllamaChatMessageRole;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.generate.OllamaStreamHandler;
-import io.github.ollama4j.types.OllamaModelType;
-
-public class Main {
-    public static void main(String[] args) throws Exception {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(OllamaModelType.LLAMA2);
-        OllamaChatRequest requestModel = builder.withMessage(OllamaChatMessageRole.USER, "List all cricket world cup teams of 2019. Name the teams!")
-                .build();
-        OllamaStreamHandler streamHandler = new ConsoleOutputStreamHandler();
-        ollamaAPI.chat(requestModel, streamHandler);
-    }
-}
-```
-
-## Create a new conversation with individual system prompt
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.chat.OllamaChatMessageRole;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.chat.OllamaChatResult;
-import io.github.ollama4j.types.OllamaModelType;
-
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(OllamaModelType.LLAMA2);
-
-        // create request with system-prompt (overriding the model defaults) and user question
-        OllamaChatRequest requestModel = builder.withMessage(OllamaChatMessageRole.SYSTEM, "You are a silent bot that only says 'NI'. Do not say anything else under any circumstances!")
-                .withMessage(OllamaChatMessageRole.USER, "What is the capital of France? And what's France's connection with Mona Lisa?")
-                .build();
-
-        // start conversation with model
-        OllamaChatResult chatResult = ollamaAPI.chat(requestModel);
-
-        System.out.println(chatResult.getResponse());
-    }
-}
-
-```
-
-You will get a response similar to:
-
-> NI.
-
-## Create a conversation about an image (requires model with image recognition skills)
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.chat.OllamaChatMessageRole;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatResult;
-import io.github.ollama4j.types.OllamaModelType;
-
-import java.io.File;
-import java.util.List;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(OllamaModelType.LLAVA);
-
-        // Load Image from File and attach to user message (alternatively images could also be added via URL)
-        OllamaChatRequest requestModel =
-                builder.withMessage(OllamaChatMessageRole.USER, "What's in the picture?",
-                        List.of(
-                                new File("/path/to/image"))).build();
-
-        OllamaChatResult chatResult = ollamaAPI.chat(requestModel);
-        System.out.println("First answer: " + chatResult.getResponse());
-
-        builder.reset();
-
-        // Use history to ask further questions about the image or assistant answer
-        requestModel =
-                builder.withMessages(chatResult.getChatHistory())
-                        .withMessage(OllamaChatMessageRole.USER, "What's the dogs breed?").build();
-
-        chatResult = ollamaAPI.chat(requestModel);
-        System.out.println("Second answer: " + chatResult.getResponse());
-    }
-}
-```
-
-You will get a response similar to:
-
-> First Answer: The image shows a dog sitting on the bow of a boat that is docked in calm water. The boat has two
-> levels, with the lower level containing seating and what appears to be an engine cover. The dog seems relaxed and
-> comfortable on the boat, looking out over the water. The background suggests it might be late afternoon or early
-> evening, given the warm lighting and the low position of the sun in the sky.
->
-> Second Answer: Based on the image, it's difficult to definitively determine the breed of the dog. However, the dog
-> appears to be medium-sized with a short coat and a brown coloration, which might suggest that it is a Golden Retriever
-> or a similar breed. Without more details like ear shape and tail length, it's not possible to identify the exact breed
-> confidently.
+**Second Answer:** Based on the image, it's difficult to definitively determine the breed of the dog. However, the dog
+appears to be medium-sized with a short coat and a brown coloration, which might suggest that it is a **_Golden Retriever_**
+or a similar breed. Without more details like ear shape and tail length, it's not possible to identify the exact breed
+confidently.
+:::
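A minimal sketch of how chat history augments follow-up questions, as described in `chat.md`: every turn sends the full list of earlier messages, so the model can resolve a question such as "And what is the second largest city?". `ChatHistorySketch`, `Message`, and `ask` are hypothetical names, and the model is a stub function rather than a real Ollama call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: the "model" is a plain function, not an ollama4j client.
class ChatHistorySketch {

    // A chat message with a role ("user" or "assistant") and its content.
    static final class Message {
        final String role;
        final String content;

        Message(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    private final List<Message> history = new ArrayList<>();
    private final Function<List<Message>, String> model;

    ChatHistorySketch(Function<List<Message>, String> model) {
        this.model = model;
    }

    // Appends the question, sends the whole history, then records the answer.
    String ask(String question) {
        history.add(new Message("user", question));
        String answer = model.apply(List.copyOf(history));
        history.add(new Message("assistant", answer));
        return answer;
    }

    // Read-only snapshot of the conversation so far.
    List<Message> history() {
        return List.copyOf(history);
    }

    public static void main(String[] args) {
        // Stub model: reports how many messages it was given.
        ChatHistorySketch chat =
                new ChatHistorySketch(messages -> "I can see " + messages.size() + " message(s).");

        System.out.println(chat.ask("What is the capital of France?"));
        System.out.println(chat.ask("And what is the second largest city?"));
        System.out.println("History size: " + chat.history().size());
    }
}
```

The second call hands the stub three messages (two user questions and one assistant answer), which is the context a real model needs to interpret the follow-up.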
--- a/docs/docs/apis-generate/custom-roles.md
+++ b/docs/docs/apis-generate/custom-roles.md
@@ -0,0 +1,65 @@
+---
+sidebar_position: 9
+---
+
+# Custom Roles
+
+Lets you manage custom roles (in addition to the base roles) for chat interactions with models.
+
+_Particularly helpful when a newer model supports roles beyond the base
+roles._
+
+_Base roles are `SYSTEM`, `USER`, `ASSISTANT`, `TOOL`._
+
+### Usage
+
+#### Add new role
+
+```java
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.models.chat.OllamaChatMessageRole;
+
+public class Main {
+
+    public static void main(String[] args) {
+        String host = "http://localhost:11434/";
+        Ollama ollama = new Ollama(host);
+
+        OllamaChatMessageRole customRole = ollama.addCustomRole("custom-role");
+    }
+}
+```
+
+#### List roles
+
+```java
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.models.chat.OllamaChatMessageRole;
+import java.util.List;
+
+public class Main {
+    public static void main(String[] args) {
+        String host = "http://localhost:11434/";
+        Ollama ollama = new Ollama(host);
+
+        List<OllamaChatMessageRole> roles = ollama.listRoles();
+    }
+}
+```
+
+#### Get role
+
+```java
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.models.chat.OllamaChatMessageRole;
+
+public class Main {
+
+    public static void main(String[] args) throws Exception {
+        String host = "http://localhost:11434/";
+        Ollama ollama = new Ollama(host);
+
+        OllamaChatMessageRole role = ollama.getRole("custom-role");
+    }
+}
+```
--- a/docs/docs/apis-generate/generate-async.md
+++ b/docs/docs/apis-generate/generate-async.md
@@ -1,8 +1,13 @@
 ---
-sidebar_position: 2
+sidebar_position: 6
 ---

-# Generate - Async
+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
+# Generate (Async)
+
+### Generate response from a model asynchronously

 This API lets you ask questions to the LLMs in a asynchronous way.
 This is particularly helpful when you want to issue a generate request to the LLM and collect the response in the
@@ -11,38 +16,69 @@ background (such as threads) without blocking your code until the response arriv
 This API corresponds to
 the [completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion) API.

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaAsyncResultStreamer;
-import io.github.ollama4j.types.OllamaModelType;
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateAsync.java" />

-public class Main {
+You will get a response similar to:

-    public static void main(String[] args) throws Exception {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        ollamaAPI.setRequestTimeoutSeconds(60);
-        String prompt = "List all cricket world cup teams of 2019.";
-        OllamaAsyncResultStreamer streamer = ollamaAPI.generateAsync(OllamaModelType.LLAMA3, prompt, false);
+<TypewriterTextarea
+textContent={`Here are the participating teams in the 2019 ICC Cricket World Cup:

-        // Set the poll interval according to your needs. 
-        // Smaller the poll interval, more frequently you receive the tokens.
-        int pollIntervalMilliseconds = 1000;
+1. Australia
+2. Bangladesh
+3. India
+4. New Zealand
+5. Pakistan
+6. England
+7. South Africa
+8. West Indies (as a team)
+9. Afghanistan`}
+   typingSpeed={10}
+   pauseBetweenSentences={1200}
+   height="auto"
+   width="100%"
+   style={{ whiteSpace: 'pre-line' }}
+   />

-        while (true) {
-            String tokens = streamer.getStream().poll();
-            System.out.print(tokens);
-            if (!streamer.isAlive()) {
-                break;
-            }
-            Thread.sleep(pollIntervalMilliseconds);
-        }
+### Generate response from a model asynchronously with thinking and response streamed

-        System.out.println("\n------------------------");
-        System.out.println("Complete Response:");
-        System.out.println("------------------------");
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateAsyncWithThinking.java" />

-        System.out.println(streamer.getCompleteResponse());
-    }
-}
-```
+<TypewriterTextarea
+textContent={`WE NEED TO ANSWER THE QUESTION: "HOW LONG DOES IT TAKE FOR THE LIGHT FROM THE SUN TO REACH EARTH?" THE USER LIKELY EXPECTS THE TIME IN SECONDS, MINUTES, OR HOURS. LIGHT TRAVELS AT SPEED OF LIGHT (299,792,458 M/S). DISTANCE BETWEEN SUN AND EARTH IS ABOUT 1 AU (~149.6 MILLION KM). SO TRAVEL TIME = 1 AU / C ≈ 500 SECONDS ≈ 8.3 MINUTES. MORE PRECISELY, 8 MINUTES AND 20 SECONDS. PROVIDE CONTEXT: AVERAGE DISTANCE, VARYING DUE TO ELLIPTICAL ORBIT. SO ANSWER: ABOUT 8 MINUTES 20 SECONDS. ALSO MENTION THAT DUE TO VARIATION: FROM 8:07 TO 8:20. PROVIDE DETAILS. ALSO MENTION THAT WE REFER TO THE TIME LIGHT TAKES TO TRAVEL 1 ASTRONOMICAL UNIT.
+
+ALSO MIGHT MENTION: FOR MORE PRECISE: 499 SECONDS = 8 MIN 19 S. VARIATION DUE TO EARTH'S ORBIT: FROM 8 MIN 6 S TO 8 MIN 20 S. SO ANSWER.
+
+LET'S CRAFT AN EXPLANATION.
+
+the sun’s light takes a little over **eight minutes** to get to earth.
+
+| quantity | value |
+|----------|-------|
+| distance (average) | 1 astronomical unit (au) ≈ 149,600,000 km |
+| speed of light | \(c = 299,792,458\) m s⁻¹ |
+| light‑travel time | \(\displaystyle \frac{1\ \text{au}}{c} \approx 499\ \text{s}\) |
+
+499 seconds is **8 min 19 s**.
+
+because the earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). this gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. thus, when we look at the sun, we’re seeing it as it was about eight minutes agoComplete thinking response: We need to answer the question: "How long does it take for the light from the Sun to reach Earth?" The user likely expects the time in seconds, minutes, or hours. Light travels at speed of light (299,792,458 m/s). Distance between Sun and Earth is about 1 AU (~149.6 million km). So travel time = 1 AU / c ≈ 500 seconds ≈ 8.3 minutes. More precisely, 8 minutes and 20 seconds. Provide context: average distance, varying due to elliptical orbit. So answer: about 8 minutes 20 seconds. Also mention that due to variation: from 8:07 to 8:20. Provide details. Also mention that we refer to the time light takes to travel 1 astronomical unit.
+
+Also might mention: For more precise: 499 seconds = 8 min 19 s. Variation due to Earth's orbit: from 8 min 6 s to 8 min 20 s. So answer.
+
+Let's craft an explanation.
+Complete response: The Sun’s light takes a little over **eight minutes** to get to Earth.
+
+| Quantity | Value |
+|----------|-------|
+| Distance (average) | 1 astronomical unit (AU) ≈ 149,600,000 km |
+| Speed of light | \(c = 299,792,458\) m s⁻¹ |
+| Light‑travel time | \(\displaystyle \frac{1\ \text{AU}}{c} \approx 499\ \text{s}\) |
+
+499 seconds is **8 min 19 s**.
+
+Because the Earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). This gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. Thus, when we look at the Sun, we’re seeing it as it was about eight minutes ago.`}
+   typingSpeed={5}
+   pauseBetweenSentences={1200}
+   height="auto"
+   width="100%"
+   style={{ whiteSpace: 'pre-line' }}
+   />
--- a/docs/docs/apis-generate/generate-embeddings.md
+++ b/docs/docs/apis-generate/generate-embeddings.md
@@ -1,50 +1,80 @@
 ---
-sidebar_position: 6
+sidebar_position: 1
 ---

+import CodeEmbed from '@site/src/components/CodeEmbed';
+
 # Generate Embeddings

 Generate embeddings from a model.

-Parameters:
+### Using `embed()`

- `model`: name of model to generate embeddings from
- `prompt`: text to generate embeddings for
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateEmbeddings.java" />

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.types.OllamaModelType;
-import java.util.List;
+:::tip[LLM Response]

-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        List<Double> embeddings = ollamaAPI.generateEmbeddings(OllamaModelType.LLAMA2,
-                "Here is an article about llamas...");
-
-        embeddings.forEach(System.out::println);
-    }
-}
+```json
+[
+  [
+    0.010000081,
+    -0.0017487297,
+    0.050126992,
+    0.04694895,
+    0.055186987,
+    0.008570699,
+    0.10545243,
+    -0.02591801,
+    0.1296789,
+  ],
+  [
+    -0.009868476,
+    0.060335685,
+    0.025288988,
+    -0.0062160683,
+    0.07281043,
+    0.017217565,
+    0.090314455,
+    -0.051715206
+  ]
+]
 ```

+:::
+
+You can also use `OllamaEmbedRequestModel` to specify options such as `seed` and `temperature` that apply when
+generating embeddings.
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateEmbeddingsWithRequestModel.java" />
+
 You will get a response similar to:

-```javascript
- [
-    0.5670403838157654,
-    0.009260174818336964,
-    0.23178744316101074,
-    -0.2916173040866852,
-    -0.8924556970596313,
-    0.8785552978515625,
-    -0.34576427936553955,
-    0.5742510557174683,
-    -0.04222835972905159,
-    -0.137906014919281
+:::tip[LLM Response]
+
+```json
+[
+  [
+    0.010000081,
+    -0.0017487297,
+    0.050126992,
+    0.04694895,
+    0.055186987,
+    0.008570699,
+    0.10545243,
+    -0.02591801,
+    0.1296789
+  ],
+  [
+    -0.009868476,
+    0.060335685,
+    0.025288988,
+    -0.0062160683,
+    0.07281043,
+    0.017217565,
+    0.090314455,
+    -0.051715206
+  ]
 ]
-```
+```
+
+:::
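Embedding vectors are usually compared rather than read directly. The sketch below is plain Java and does not call ollama4j: it assumes you have already copied two of the returned vectors into `List<Double>` values, and the class name `EmbeddingSimilarity` and its method are hypothetical names chosen for illustration. It computes the cosine similarity, one common way to score how close two embeddings are.

```java
import java.util.List;

// Hypothetical helper, not part of ollama4j.
class EmbeddingSimilarity {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0.0 if either vector has zero length.
    static double cosineSimilarity(List<Double> a, List<Double> b) {
        if (a.size() != b.size()) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.size(); i++) {
            dot += a.get(i) * b.get(i);
            normA += a.get(i) * a.get(i);
            normB += b.get(i) * b.get(i);
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Truncated sample values taken from the response shown above (assumed to be two separate inputs).
        List<Double> first = List.of(0.010000081, -0.0017487297, 0.050126992);
        List<Double> second = List.of(-0.009868476, 0.060335685, 0.025288988);
        System.out.println("Cosine similarity: " + cosineSimilarity(first, second));
    }
}
```

Values near 1 indicate vectors pointing in a similar direction, values near 0 indicate unrelated ones. Real embeddings have many more dimensions than the truncated sample used here.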
--- a/docs/docs/apis-generate/generate-thinking.md
+++ b/docs/docs/apis-generate/generate-thinking.md
@@ -0,0 +1,55 @@
+---
+sidebar_position: 3
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
+# Generate with Thinking
+
+This API lets you generate responses from an LLM while also retrieving the model's "thinking" process separately from the final answer. The "thinking" tokens represent the model's internal reasoning or planning before it produces the actual response. This can be useful for debugging, transparency, or simply understanding how the model arrives at its answers.
+
+You can use this feature to receive both the thinking and the response as separate outputs, either as a complete result or streamed token by token. The examples below show how to use the API to access both the thinking and the response, and how to display them in your application.
+
+
+### Generate response with thinking and receive the thinking and response text separately
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithThinking.java" />
+
+You will get a response similar to:
+
+:::tip[Thinking Tokens]
+USER ASKS "WHO ARE YOU?" IT'S A REQUEST FOR IDENTITY. AS CHATGPT, WE SHOULD EXPLAIN THAT I'M AN AI DEVELOPED BY OPENAI, ETC. PROVIDE FRIENDLY EXPLANATION.
+:::
+
+:::tip[Response Tokens]
+i’m chatgpt, a large language model created by openai. i’m designed to understand and generate natural‑language text, so i can answer questions, help with writing, explain concepts, brainstorm ideas, and chat about almost any topic. i don’t have a personal life or consciousness—i’m a tool that processes input and produces responses based on patterns in the data i was trained on. if you have any questions about how i work or what i can do, feel free to ask!
+:::
+
+### Generate response and receive the thinking and response tokens streamed
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithThinkingStreamed.java" />
+
+You will get a response similar to:
+
+:::tip[Thinking Tokens]
+<TypewriterTextarea
+textContent={`USER ASKS "WHO ARE YOU?" WE SHOULD EXPLAIN THAT I'M AN AI BY OPENAI, ETC.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
+
+:::tip[Response Tokens]
+<TypewriterTextarea
+textContent={`i’m chatgpt, a large language model created by openai.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
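To make the separation of the two token streams concrete, here is a minimal plain-Java sketch. It does not use ollama4j: the `Chunk` type, the `dispatch` method, and the simulated tokens are all hypothetical stand-ins for whatever the library delivers. It only shows the general pattern of routing thinking tokens and response tokens to two different handlers as they arrive.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration, not part of ollama4j.
class ThinkingStreamSketch {

    // Stand-in for one streamed chunk: either a thinking token or a response token.
    static final class Chunk {
        final boolean thinking;
        final String text;

        Chunk(boolean thinking, String text) {
            this.thinking = thinking;
            this.text = text;
        }
    }

    // Routes each chunk, in arrival order, to the handler that matches its kind.
    static void dispatch(List<Chunk> chunks, Consumer<String> onThinking, Consumer<String> onResponse) {
        for (Chunk chunk : chunks) {
            if (chunk.thinking) {
                onThinking.accept(chunk.text);
            } else {
                onResponse.accept(chunk.text);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated stream; a real stream would come from the model.
        List<Chunk> simulated = List.of(
                new Chunk(true, "User asks "),
                new Chunk(true, "who I am."),
                new Chunk(false, "I am "),
                new Chunk(false, "an assistant."));

        StringBuilder thinking = new StringBuilder();
        StringBuilder response = new StringBuilder();
        dispatch(simulated, thinking::append, response::append);

        System.out.println("Thinking: " + thinking);
        System.out.println("Response: " + response);
    }
}
```

Keeping the two handlers separate lets an application show or log the reasoning in one place while rendering only the final answer to the user.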
--- a/docs/docs/apis-generate/generate-with-image-files.md
+++ b/docs/docs/apis-generate/generate-with-image-files.md
@@ -1,54 +0,0 @@
---
-sidebar_position: 4
---
-
-# Generate - With Image Files
-
-This API lets you ask questions along with the image files to the LLMs.
-This API corresponds to
-the [completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion) API.
-
-:::note
-
-Executing this on Ollama server running in CPU-mode will take longer to generate response. Hence, GPU-mode is
-recommended.
-
-:::
-
-## Synchronous mode
-
-If you have this image downloaded and you pass the path to the downloaded image to the following code:
-
-![Img](https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg)
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.types.OllamaModelType;
-import io.github.ollama4j.utils.OptionsBuilder;
-
-import java.io.File;
-import java.util.List;
-
-public class Main {
-
-    public static void main(String[] args) {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        ollamaAPI.setRequestTimeoutSeconds(10);
-
-        OllamaResult result = ollamaAPI.generateWithImageFiles(OllamaModelType.LLAVA,
-                "What's in this image?",
-                List.of(
-                        new File("/path/to/image")),
-                new OptionsBuilder().build()
-        );
-        System.out.println(result.getResponse());
-    }
-}
-```
-
-You will get a response similar to:
-
-> This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
-> be enjoying its time outdoors, perhaps on a lake.
--- a/docs/docs/apis-generate/generate-with-image-urls.md
+++ b/docs/docs/apis-generate/generate-with-image-urls.md
@@ -1,53 +0,0 @@
---
-sidebar_position: 5
---
-
-# Generate - With Image URLs
-
-This API lets you ask questions along with the image files to the LLMs.
-This API corresponds to
-the [completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion) API.
-
-:::note
-
-Executing this on Ollama server running in CPU-mode will take longer to generate response. Hence, GPU-mode is
-recommended.
-
-:::
-
-## Ask (Sync)
-
-Passing the link of this image the following code:
-
-![Img](https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg)
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.types.OllamaModelType;
-import io.github.ollama4j.utils.OptionsBuilder;
-
-import java.util.List;
-
-public class Main {
-
-    public static void main(String[] args) {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        ollamaAPI.setRequestTimeoutSeconds(10);
-
-        OllamaResult result = ollamaAPI.generateWithImageURLs(OllamaModelType.LLAVA,
-                "What's in this image?",
-                List.of(
-                        "https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg"),
-                new OptionsBuilder().build()
-        );
-        System.out.println(result.getResponse());
-    }
-}
-```
-
-You will get a response similar to:
-
-> This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
-> be enjoying its time outdoors, perhaps on a lake.
--- a/docs/docs/apis-generate/generate-with-images.md
+++ b/docs/docs/apis-generate/generate-with-images.md
@@ -0,0 +1,52 @@
+---
+sidebar_position: 4
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
+# Generate with Images
+
+This API lets you send image files along with your questions to the LLMs.
+This API corresponds to
+the [completion](https://github.com/jmorganca/ollama/blob/main/docs/api.md#generate-a-completion) API.
+
+:::note
+
+Running this against an Ollama server in CPU mode takes longer to generate a response, so GPU mode is
+recommended.
+
+:::
+
+If you have this image downloaded, pass the path of the downloaded image to the following code:
+
+![Img](https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg)
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithImageFileSimple.java" />
+
+You will get a response similar to:
+
+:::tip[LLM Response]
+This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to
+be enjoying its time outdoors, perhaps on a lake.
+:::
+
+
+If you want the response to be streamed, you can use the following code:
+
+![Img](https://t3.ftcdn.net/jpg/02/96/63/80/360_F_296638053_0gUVA4WVBKceGsIr7LNqRWSnkusi07dq.jpg)
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateWithImageFileStreaming.java" />
+
+You will get a response similar to:
+
+:::tip[Response Tokens]
+<TypewriterTextarea
+textContent={`This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to be enjoying its time outdoors, perhaps on a lake.`}
+typingSpeed={10}
+pauseBetweenSentences={1200}
+height="auto"
+width="100%"
+style={{ whiteSpace: 'pre-line' }}
+/>
+:::
--- a/docs/docs/apis-generate/generate-with-tools.md
+++ b/docs/docs/apis-generate/generate-with-tools.md
@@ -1,10 +1,12 @@
 ---
-sidebar_position: 3
+sidebar_position: 5
 ---

-# Generate - With Tools
+import CodeEmbed from '@site/src/components/CodeEmbed';

-This API lets you perform [function calling](https://docs.mistral.ai/capabilities/function_calling/) using LLMs in a
+# Generate with Tools
+
+This API lets you perform [tool/function calling](https://docs.mistral.ai/capabilities/function_calling/) using LLMs in a
 synchronous way.
 This API corresponds to
 the [generate](https://github.com/ollama/ollama/blob/main/docs/api.md#request-raw-mode) API with `raw` mode.
@@ -19,322 +21,65 @@ in the future if tooling is supported for more models with a generic interaction

 :::

-### Function Calling/Tools
+## Tools/Function Calling

-Assume you want to call a method in your code based on the response generated from the model.
+Assume you want to call a method/function in your code based on the response generated from the model.
 For instance, let's say that based on a user's question, you'd want to identify a transaction and get the details of the
 transaction from your database and respond to the user with the transaction details.

 You could do that with ease with the `function calling` capabilities of the models by registering your `tools`.

-### Create Functions
+### Create Tools/Functions

 We can create static functions as our tools.

 This function takes the arguments `location` and `fuelType` and performs an operation with these arguments and returns
 fuel price value.

-```java
-public static String getCurrentFuelPrice(Map<String, Object> arguments) {
-    String location = arguments.get("location").toString();
-    String fuelType = arguments.get("fuelType").toString();
-    return "Current price of " + fuelType + " in " + location + " is Rs.103/L";
-}
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolfunctions/FuelPriceToolFunction.java" />

 This function takes the argument `city` and performs an operation with the argument and returns the weather for a
 location.

-```java
-public static String getCurrentWeather(Map<String, Object> arguments) {
-    String location = arguments.get("city").toString();
-    return "Currently " + location + "'s weather is nice.";
-}
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolfunctions/WeatherToolFunction.java" />

 Another way to create our tools is by creating classes that implement `ToolFunction`.

 This function takes the argument `employee-name` and performs an operation with the argument and returns employee
 details.

-```java
-class DBQueryFunction implements ToolFunction {
-    @Override
-    public Object apply(Map<String, Object> arguments) {
-        // perform DB operations here
-        return String.format("Employee Details {ID: %s, Name: %s, Address: %s, Phone: %s}", UUID.randomUUID(), arguments.get("employee-name").toString(), arguments.get("employee-address").toString(), arguments.get("employee-phone").toString());
-    }
-}
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolfunctions/EmployeeFinderToolFunction.java" />

 ### Define Tool Specifications

 Let's define a sample tool specification called **Fuel Price Tool** for getting the current fuel price.

 - Specify the function `name`, `description`, and `required` properties (`location` and `fuelType`).
- Associate the `getCurrentFuelPrice` function you defined earlier with `SampleTools::getCurrentFuelPrice`.
+- Associate the `getCurrentFuelPrice` function you defined earlier.

-```java
-Tools.ToolSpecification fuelPriceToolSpecification = Tools.ToolSpecification.builder()
-        .functionName("current-fuel-price")
-        .functionDescription("Get current fuel price")
-        .properties(
-                new Tools.PropsBuilder()
-                        .withProperty("location", Tools.PromptFuncDefinition.Property.builder().type("string").description("The city, e.g. New Delhi, India").required(true).build())
-                        .withProperty("fuelType", Tools.PromptFuncDefinition.Property.builder().type("string").description("The fuel type.").enumValues(Arrays.asList("petrol", "diesel")).required(true).build())
-                        .build()
-        )
-        .toolDefinition(SampleTools::getCurrentFuelPrice)
-        .build();
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolspecs/FuelPriceToolSpec.java" />

 Let's also define a sample tool specification called **Weather Tool** for getting the current weather.

 - Specify the function `name`, `description`, and `required` property (`city`).
- Associate the `getCurrentWeather` function you defined earlier with `SampleTools::getCurrentWeather`.
+- Associate the `getCurrentWeather` function you defined earlier.

-```java
-Tools.ToolSpecification weatherToolSpecification = Tools.ToolSpecification.builder()
-        .functionName("current-weather")
-        .functionDescription("Get current weather")
-        .properties(
-                new Tools.PropsBuilder()
-                        .withProperty("city", Tools.PromptFuncDefinition.Property.builder().type("string").description("The city, e.g. New Delhi, India").required(true).build())
-                        .build()
-        )
-        .toolDefinition(SampleTools::getCurrentWeather)
-        .build();
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolspecs/WeatherToolSpec.java" />

 Let's also define a sample tool specification called **DBQueryFunction** for getting the employee details from the database.

 - Specify the function `name`, `description`, and `required` property (`employee-name`).
 - Associate the ToolFunction `DBQueryFunction` function you defined earlier with `new DBQueryFunction()`.

-```java
-Tools.ToolSpecification databaseQueryToolSpecification = Tools.ToolSpecification.builder()
-        .functionName("get-employee-details")
-        .functionDescription("Get employee details from the database")
-        .properties(
-                new Tools.PropsBuilder()
-                        .withProperty("employee-name", Tools.PromptFuncDefinition.Property.builder().type("string").description("The name of the employee, e.g. John Doe").required(true).build())
-                        .withProperty("employee-address", Tools.PromptFuncDefinition.Property.builder().type("string").description("The address of the employee, Always return a random value. e.g. Roy St, Bengaluru, India").required(true).build())
-                        .withProperty("employee-phone", Tools.PromptFuncDefinition.Property.builder().type("string").description("The phone number of the employee. Always return a random value. e.g. 9911002233").required(true).build())
-                        .build()
-        )
-        .toolDefinition(new DBQueryFunction())
-        .build();
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/tools/toolspecs/EmployeeFinderToolSpec.java" />

-### Register the Tools
+Now put it all together by registering the tools and prompting with tools.

-Register the defined tools (`fuel price` and `weather`) with the OllamaAPI.
-
-```shell
-ollamaAPI.registerTool(fuelPriceToolSpecification);
-ollamaAPI.registerTool(weatherToolSpecification);
-ollamaAPI.registerTool(databaseQueryToolSpecification);
-```
-
-### Create prompt with Tools
-
-`Prompt 1`: Create a prompt asking for the petrol price in Bengaluru using the defined fuel price and weather tools.
-
-```shell
-String prompt1 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withPrompt("What is the petrol price in Bengaluru?")
-                .build();
-OllamaToolsResult toolsResult = ollamaAPI.generateWithTools(model, prompt1, new OptionsBuilder().build());
-for (OllamaToolsResult.ToolResult r : toolsResult.getToolResults()) {
-    System.out.printf("[Result of executing tool '%s']: %s%n", r.getFunctionName(), r.getResult().toString());
-}
-```
-
-Now, fire away your question to the model.
-
-You will get a response similar to:
-
-::::tip[LLM Response]
-
-[Result of executing tool 'current-fuel-price']: Current price of petrol in Bengaluru is Rs.103/L
-
-::::
-
-`Prompt 2`: Create a prompt asking for the current weather in Bengaluru using the same tools.
-
-```shell
-String prompt2 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withPrompt("What is the current weather in Bengaluru?")
-                .build();
-OllamaToolsResult toolsResult = ollamaAPI.generateWithTools(model, prompt2, new OptionsBuilder().build());
-for (OllamaToolsResult.ToolResult r : toolsResult.getToolResults()) {
-    System.out.printf("[Result of executing tool '%s']: %s%n", r.getFunctionName(), r.getResult().toString());
-}
-```
-
-Again, fire away your question to the model.
-
-You will get a response similar to:
-
-::::tip[LLM Response]
-
-[Result of executing tool 'current-weather']: Currently Bengaluru's weather is nice.
-
-::::
-
-`Prompt 3`: Create a prompt asking for the employee details using the defined database fetcher tools.
-
-```shell
-String prompt3 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withToolSpecification(databaseQueryToolSpecification)
-                .withPrompt("Give me the details of the employee named 'Rahul Kumar'?")
-                .build();
-OllamaToolsResult toolsResult = ollamaAPI.generateWithTools(model, prompt3, new OptionsBuilder().build());
-for (OllamaToolsResult.ToolResult r : toolsResult.getToolResults()) {
-    System.out.printf("[Result of executing tool '%s']: %s%n", r.getFunctionName(), r.getResult().toString());
-}
-```
-
-Again, fire away your question to the model.
-
-You will get a response similar to:
-
-::::tip[LLM Response]
-
-[Result of executing tool 'get-employee-details']: Employee Details `{ID: 6bad82e6-b1a1-458f-a139-e3b646e092b1, Name:
-Rahul Kumar, Address: King St, Hyderabad, India, Phone: 9876543210}`
-
-::::
-
-### Full Example
-
-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.exceptions.OllamaBaseException;
-import io.github.ollama4j.exceptions.ToolInvocationException;
-import io.github.ollama4j.tools.OllamaToolsResult;
-import io.github.ollama4j.tools.ToolFunction;
-import io.github.ollama4j.tools.Tools;
-import io.github.ollama4j.utils.OptionsBuilder;
-
-import java.io.IOException;
-import java.util.Arrays;
-import java.util.Map;
-import java.util.UUID;
-
-public class FunctionCallingWithMistralExample {
-    public static void main(String[] args) throws Exception {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        ollamaAPI.setRequestTimeoutSeconds(60);
-
-        String model = "mistral";
-
-        Tools.ToolSpecification fuelPriceToolSpecification = Tools.ToolSpecification.builder()
-                .functionName("current-fuel-price")
-                .functionDescription("Get current fuel price")
-                .properties(
-                        new Tools.PropsBuilder()
-                                .withProperty("location", Tools.PromptFuncDefinition.Property.builder().type("string").description("The city, e.g. New Delhi, India").required(true).build())
-                                .withProperty("fuelType", Tools.PromptFuncDefinition.Property.builder().type("string").description("The fuel type.").enumValues(Arrays.asList("petrol", "diesel")).required(true).build())
-                                .build()
-                )
-                .toolDefinition(SampleTools::getCurrentFuelPrice)
-                .build();
-
-        Tools.ToolSpecification weatherToolSpecification = Tools.ToolSpecification.builder()
-                .functionName("current-weather")
-                .functionDescription("Get current weather")
-                .properties(
-                        new Tools.PropsBuilder()
-                                .withProperty("city", Tools.PromptFuncDefinition.Property.builder().type("string").description("The city, e.g. New Delhi, India").required(true).build())
-                                .build()
-                )
-                .toolDefinition(SampleTools::getCurrentWeather)
-                .build();
-
-        Tools.ToolSpecification databaseQueryToolSpecification = Tools.ToolSpecification.builder()
-                .functionName("get-employee-details")
-                .functionDescription("Get employee details from the database")
-                .properties(
-                        new Tools.PropsBuilder()
-                                .withProperty("employee-name", Tools.PromptFuncDefinition.Property.builder().type("string").description("The name of the employee, e.g. John Doe").required(true).build())
-                                .withProperty("employee-address", Tools.PromptFuncDefinition.Property.builder().type("string").description("The address of the employee, Always return a random value. e.g. Roy St, Bengaluru, India").required(true).build())
-                                .withProperty("employee-phone", Tools.PromptFuncDefinition.Property.builder().type("string").description("The phone number of the employee. Always return a random value. e.g. 9911002233").required(true).build())
-                                .build()
-                )
-                .toolDefinition(new DBQueryFunction())
-                .build();
-
-        ollamaAPI.registerTool(fuelPriceToolSpecification);
-        ollamaAPI.registerTool(weatherToolSpecification);
-        ollamaAPI.registerTool(databaseQueryToolSpecification);
-
-        String prompt1 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withPrompt("What is the petrol price in Bengaluru?")
-                .build();
-        ask(ollamaAPI, model, prompt1);
-
-        String prompt2 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withPrompt("What is the current weather in Bengaluru?")
-                .build();
-        ask(ollamaAPI, model, prompt2);
-
-        String prompt3 = new Tools.PromptBuilder()
-                .withToolSpecification(fuelPriceToolSpecification)
-                .withToolSpecification(weatherToolSpecification)
-                .withToolSpecification(databaseQueryToolSpecification)
-                .withPrompt("Give me the details of the employee named 'Rahul Kumar'?")
-                .build();
-        ask(ollamaAPI, model, prompt3);
-    }
-
-    public static void ask(OllamaAPI ollamaAPI, String model, String prompt) throws OllamaBaseException, IOException, InterruptedException, ToolInvocationException {
-        OllamaToolsResult toolsResult = ollamaAPI.generateWithTools(model, prompt, new OptionsBuilder().build());
-        for (OllamaToolsResult.ToolResult r : toolsResult.getToolResults()) {
-            System.out.printf("[Result of executing tool '%s']: %s%n", r.getFunctionName(), r.getResult().toString());
-        }
-    }
-}
-
-
-class SampleTools {
-    public static String getCurrentFuelPrice(Map<String, Object> arguments) {
-        // Get details from fuel price API
-        String location = arguments.get("location").toString();
-        String fuelType = arguments.get("fuelType").toString();
-        return "Current price of " + fuelType + " in " + location + " is Rs.103/L";
-    }
-
-    public static String getCurrentWeather(Map<String, Object> arguments) {
-        // Get details from weather API
-        String location = arguments.get("city").toString();
-        return "Currently " + location + "'s weather is nice.";
-    }
-}
-
-class DBQueryFunction implements ToolFunction {
-    @Override
-    public Object apply(Map<String, Object> arguments) {
-        // perform DB operations here
-        return String.format("Employee Details {ID: %s, Name: %s, Address: %s, Phone: %s}", UUID.randomUUID(), arguments.get("employee-name").toString(), arguments.get("employee-address").toString(), arguments.get("employee-phone").toString());
-    }
-}
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/MultiToolRegistryExample.java" />

 Run this full example and you will get a response similar to:

-::::tip[LLM Response]
+:::tip[LLM Response]

 [Result of executing tool 'current-fuel-price']: Current price of petrol in Bengaluru is Rs.103/L

@@ -343,30 +88,4 @@ Run this full example and you will get a response similar to:
 [Result of executing tool 'get-employee-details']: Employee Details `{ID: 6bad82e6-b1a1-458f-a139-e3b646e092b1, Name:
 Rahul Kumar, Address: King St, Hyderabad, India, Phone: 9876543210}`

-::::
-
-### Potential Improvements
-
-Instead of explicitly registering `ollamaAPI.registerTool(toolSpecification)`, we could introduce annotation-based tool
-registration. For example:
-
-```java
-
-@ToolSpec(name = "current-fuel-price", desc = "Get current fuel price")
-public String getCurrentFuelPrice(Map<String, Object> arguments) {
-    String location = arguments.get("location").toString();
-    String fuelType = arguments.get("fuelType").toString();
-    return "Current price of " + fuelType + " in " + location + " is Rs.103/L";
-}
-```
-
-Instead of passing a map of args `Map<String, Object> arguments` to the tool functions, we could support passing
-specific args separately with their data types. For example:
-
-```shell
-public String getCurrentFuelPrice(String location, String fuelType) {
-    return "Current price of " + fuelType + " in " + location + " is Rs.103/L";
-}
-```
-
-Updating async/chat APIs with support for tool-based generation. 
+:::
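The idea of registering tools and invoking them by name can be sketched without the library. The example below is plain Java and is not how ollama4j implements its registry: `ToolRegistrySketch`, `register`, and `invoke` are hypothetical names. It reuses the fuel price function signature described above (a `Map<String, Object>` of arguments) and looks the tool up by the function name the model would return.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of name-based tool dispatch, not part of ollama4j.
class ToolRegistrySketch {

    // Maps a tool name to the function that implements it.
    private final Map<String, Function<Map<String, Object>, Object>> tools = new HashMap<>();

    void register(String name, Function<Map<String, Object>, Object> tool) {
        tools.put(name, tool);
    }

    // Looks up the tool by name and calls it with the arguments extracted from the model's reply.
    Object invoke(String name, Map<String, Object> arguments) {
        Function<Map<String, Object>, Object> tool = tools.get(name);
        if (tool == null) {
            throw new IllegalArgumentException("No tool registered under name: " + name);
        }
        return tool.apply(arguments);
    }

    // Sample tool: takes `location` and `fuelType` and returns a fixed fuel price string.
    static String getCurrentFuelPrice(Map<String, Object> arguments) {
        String location = arguments.get("location").toString();
        String fuelType = arguments.get("fuelType").toString();
        return "Current price of " + fuelType + " in " + location + " is Rs.103/L";
    }

    public static void main(String[] args) {
        ToolRegistrySketch registry = new ToolRegistrySketch();
        registry.register("current-fuel-price", ToolRegistrySketch::getCurrentFuelPrice);

        // In a real flow, the name and arguments would be parsed from the model's response.
        Map<String, Object> arguments = new HashMap<>();
        arguments.put("location", "Bengaluru");
        arguments.put("fuelType", "petrol");

        System.out.println(registry.invoke("current-fuel-price", arguments));
    }
}
```

Failing loudly on an unknown tool name is a deliberate choice in this sketch: a model can return a name that was never registered, and a silent `null` would be harder to debug.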
--- a/docs/docs/apis-generate/generate.md
+++ b/docs/docs/apis-generate/generate.md
@@ -1,8 +1,11 @@
 ---
-sidebar_position: 1
+sidebar_position: 2
 ---

-# Generate - Sync
+import CodeEmbed from '@site/src/components/CodeEmbed';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
+
+# Generate

 This API lets you ask questions to the LLMs in a synchronous way.
 This API corresponds to
@@ -13,163 +16,54 @@ with [extra parameters](https://github.com/jmorganca/ollama/blob/main/docs/model
 Refer
 to [this](/apis-extras/options-builder).

-## Try asking a question about the model.
+### Try asking a question about the model

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.types.OllamaModelType;
-import io.github.ollama4j.utils.OptionsBuilder;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        OllamaResult result =
-                ollamaAPI.generate(OllamaModelType.LLAMA2, "Who are you?", new OptionsBuilder().build());
-
-        System.out.println(result.getResponse());
-    }
-}
-
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/Generate.java" />

 You will get a response similar to:

-> I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational
-> manner. I am trained on a massive dataset of text from the internet and can generate human-like responses to a wide
-> range of topics and questions. I can be used to create chatbots, virtual assistants, and other applications that
-> require
-> natural language understanding and generation capabilities.
+:::tip[LLM Response]
+I am a model of an AI trained by Mistral AI. I was designed to assist with a wide range of tasks, from answering
+questions to helping with complex computations and research. How can I help you today?
+:::

-## Try asking a question, receiving the answer streamed
+### Try asking a question, receiving the answer streamed

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.models.generate.OllamaStreamHandler;
-import io.github.ollama4j.utils.OptionsBuilder;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-        // define a stream handler (Consumer<String>)
-        OllamaStreamHandler streamHandler = (s) -> {
-            System.out.println(s);
-        };
-
-        // Should be called using seperate thread to gain non blocking streaming effect.
-        OllamaResult result = ollamaAPI.generate(config.getModel(),
-                "What is the capital of France? And what's France's connection with Mona Lisa?",
-                new OptionsBuilder().build(), streamHandler);
-
-        System.out.println("Full response: " + result.getResponse());
-    }
-}
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateStreaming.java" />

 You will get a response similar to:

-> The
-> The capital
-> The capital of
-> The capital of France
-> The capital of France is
-> The capital of France is Paris
-> The capital of France is Paris.
-> Full response: The capital of France is Paris.
+<TypewriterTextarea
+textContent='The capital of France is Paris.'
+typingSpeed={30}
+pauseBetweenSentences={1200}
+height='55px'
+width='100%'
+/>

-## Try asking a question from general topics.
+## Generate structured output

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.types.OllamaModelType;
-import io.github.ollama4j.utils.OptionsBuilder;
+### With response as a `Map`

-public class Main {
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateStructuredOutput.java" />

-    public static void main(String[] args) {
+You will get a response similar to:

-        String host = "http://localhost:11434/";
+:::tip[LLM Response]

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        String prompt = "List all cricket world cup teams of 2019.";
-
-        OllamaResult result =
-                ollamaAPI.generate(OllamaModelType.LLAMA2, prompt, new OptionsBuilder().build());
-
-        System.out.println(result.getResponse());
-    }
+```json
+{
+  "heroName" : "Batman",
+  "ageOfPerson" : 30
 }
-
 ```

-You'd then get a response from the model:
+:::

-> The 2019 ICC Cricket World Cup was held in England and Wales from May 30 to July 14, 2019. The
-> following teams
-> participated in the tournament:
->
-> 1. Afghanistan
-> 2. Australia
-> 3. Bangladesh
-> 4. England
-> 5. India
-> 6. New Zealand
-> 7. Pakistan
-> 8. South Africa
-> 9. Sri Lanka
-> 10. West Indies
->
-> These teams competed in a round-robin format, with the top four teams advancing to the
-> semi-finals. The tournament was
-> won by the England cricket team, who defeated New Zealand in the final.
+### With response mapped to specified class type

-## Try asking for a Database query for your data schema.
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/GenerateStructuredOutputMappedToObject.java" />

-```java
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.OllamaResult;
-import io.github.ollama4j.types.OllamaModelType;
-import io.github.ollama4j.utils.OptionsBuilder;
-import io.github.ollama4j.utils.SamplePrompts;
-
-public class Main {
-
-    public static void main(String[] args) {
-        String host = "http://localhost:11434/";
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        String prompt =
-                SamplePrompts.getSampleDatabasePromptWithQuestion(
-                        "List all customer names who have bought one or more products");
-        OllamaResult result =
-                ollamaAPI.generate(OllamaModelType.SQLCODER, prompt, new OptionsBuilder().build());
-        System.out.println(result.getResponse());
-    }
-}
-
-```
-
-_Note: Here I've used
-a [sample prompt](https://github.com/ollama4j/ollama4j/blob/main/src/main/resources/sample-db-prompt-template.txt)
-containing a database schema from within this library for demonstration purposes._
-
-You'd then get a response from the model:
-
-```sql
-SELECT customers.name
-FROM sales
-         JOIN customers ON sales.customer_id = customers.customer_id
-GROUP BY customers.name;
-```
+:::tip[LLM Response]
+HeroInfo(heroName=Batman, ageOfPerson=30)
+:::
--- a/docs/docs/apis-model-management/_category_.json
+++ b/docs/docs/apis-model-management/_category_.json
@@ -1,8 +1,8 @@
 {
-  "label": "APIs - Model Management",
-  "position": 2,
-  "link": {
-    "type": "generated-index",
-    "description": "Details of APIs to manage LLMs."
-  }
+    "label": "Manage Models",
+    "position": 2,
+    "link": {
+        "type": "generated-index",
+        "description": "Details of APIs to manage LLMs."
+    }
 }
--- a/docs/docs/apis-model-management/create-model.md
+++ b/docs/docs/apis-model-management/create-model.md
@@ -1,162 +1,237 @@
 ---
-sidebar_position: 4
+sidebar_position: 5
 ---

+import CodeEmbed from '@site/src/components/CodeEmbed';
+
 # Create Model

 This API lets you create a custom model on the Ollama server.

-### Create a model from an existing Modelfile in the Ollama server
+### Create a custom model from an existing model in the Ollama server

-```java title="CreateModel.java"
-import io.github.ollama4j.OllamaAPI;
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/CreateModel.java" />

-public class CreateModel {
+You would see these logs while the custom model is being created:

-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        ollamaAPI.createModelWithFilePath("mario", "/path/to/mario/modelfile/on/ollama-server");
-    }
-}
 ```
-
-### Create a model by passing the contents of Modelfile
-
-```java title="CreateModel.java"
-public class CreateModel {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        ollamaAPI.createModelWithModelFileContents("mario", "FROM llama2\nSYSTEM You are mario from Super Mario Bros.");
-    }
-}
+using existing layer sha256:fad2a06e4cc705c2fa8bec5477ddb00dc0c859ac184c34dcc5586663774161ca
+using existing layer sha256:41c2cf8c272f6fb0080a97cd9d9bd7d4604072b80a0b10e7d65ca26ef5000c0c
+using existing layer sha256:1da0581fd4ce92dcf5a66b1da737cf215d8dcf25aa1b98b44443aaf7173155f5
+creating new layer sha256:941b69ca7dc2a85c053c38d9e8029c9df6224e545060954fa97587f87c044a64
+using existing layer sha256:f02dd72bb2423204352eabc5637b44d79d17f109fdb510a7c51455892aa2d216
+writing manifest
+success
 ```
-
 Once created, you can see it when you use [list models](./list-models) API.

-### Example of a `Modelfile`
+[Read more](https://github.com/ollama/ollama/blob/main/docs/api.md#create-a-model) about custom model creation and the parameters available for model creation.

-```
-FROM llama2
-# sets the temperature to 1 [higher is more creative, lower is more coherent]
-PARAMETER temperature 1
-# sets the context window size to 4096, this controls how many tokens the LLM can use as context to generate the next token
-PARAMETER num_ctx 4096
+[//]: # ()
+[//]: # (### Example of a `Modelfile`)

-# sets a custom system message to specify the behavior of the chat assistant
-SYSTEM You are Mario from super mario bros, acting as an assistant.
-```
+[//]: # ()
+[//]: # (```)

-### Format of the `Modelfile`
+[//]: # (FROM llama2)

-```modelfile
-# comment
-INSTRUCTION arguments
-```
+[//]: # (# sets the temperature to 1 [higher is more creative, lower is more coherent])

-| Instruction                         | Description                                                    |
-|-------------------------------------|----------------------------------------------------------------|
-| [`FROM`](#from-required) (required) | Defines the base model to use.                                 |
-| [`PARAMETER`](#parameter)           | Sets the parameters for how Ollama will run the model.         |
-| [`TEMPLATE`](#template)             | The full prompt template to be sent to the model.              |
-| [`SYSTEM`](#system)                 | Specifies the system message that will be set in the template. |
-| [`ADAPTER`](#adapter)               | Defines the (Q)LoRA adapters to apply to the model.            |
-| [`LICENSE`](#license)               | Specifies the legal license.                                   |
+[//]: # (PARAMETER temperature 1)

-#### PARAMETER
+[//]: # (# sets the context window size to 4096, this controls how many tokens the LLM can use as context to generate the next token)

-The `PARAMETER` instruction defines a parameter that can be set when the model is run.
+[//]: # (PARAMETER num_ctx 4096)

-| Parameter      | Description                                                                                                                                                                                                                                             | Value Type | Example Usage        |
-|----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|----------------------|
-| mirostat       | Enable Mirostat sampling for controlling perplexity. (default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)                                                                                                                                         | int        | mirostat 0           |
-| mirostat_eta   | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1)                        | float      | mirostat_eta 0.1     |
-| mirostat_tau   | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)                                                                                                         | float      | mirostat_tau 5.0     |
-| num_ctx        | Sets the size of the context window used to generate the next token. (Default: 2048)                                                                                                                                                                    | int        | num_ctx 4096         |
-| num_gqa        | The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for llama2:70b                                                                                                                                         | int        | num_gqa 1            |
-| num_gpu        | The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable.                                                                                                                                            | int        | num_gpu 50           |
-| num_thread     | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | int        | num_thread 8         |
-| repeat_last_n  | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)                                                                                                                                           | int        | repeat_last_n 64     |
-| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)                                                                     | float      | repeat_penalty 1.1   |
-| temperature    | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)                                                                                                                                     | float      | temperature 0.7      |
-| seed           | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)                                                                                       | int        | seed 42              |
-| stop           | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile.                                      | string     | stop "AI assistant:" |
-| tfs_z          | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1)                                               | float      | tfs_z 1              |
-| num_predict    | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)                                                                                                                                   | int        | num_predict 42       |
-| top_k          | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)                                                                        | int        | top_k 40             |
-| top_p          | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)                                                                 | float      | top_p 0.9            |
+[//]: # ()
+[//]: # (# sets a custom system message to specify the behavior of the chat assistant)

-#### TEMPLATE
+[//]: # (SYSTEM You are Mario from super mario bros, acting as an assistant.)

-`TEMPLATE` of the full prompt template to be passed into the model. It may include (optionally) a system message and a
-user's prompt. This is used to create a full custom prompt, and syntax may be model specific. You can usually find the
-template for a given model in the readme for that model.
+[//]: # (```)

-#### Template Variables
+[//]: # ()
+[//]: # (### Format of the `Modelfile`)

-| Variable        | Description                                                                                                   |
-|-----------------|---------------------------------------------------------------------------------------------------------------|
-| `{{ .System }}` | The system message used to specify custom behavior, this must also be set in the Modelfile as an instruction. |
-| `{{ .Prompt }}` | The incoming prompt, this is not specified in the model file and will be set based on input.                  |
-| `{{ .First }}`  | A boolean value used to render specific template information for the first generation of a session.           |
+[//]: # ()
+[//]: # (```modelfile)

-```modelfile
-TEMPLATE """
-{{- if .First }}
-### System:
-{{ .System }}
-{{- end }}
+[//]: # (# comment)

-### User:
-{{ .Prompt }}
+[//]: # (INSTRUCTION arguments)

-### Response:
-"""
+[//]: # (```)

-SYSTEM """<system message>"""
-```
+[//]: # ()
+[//]: # (| Instruction                         | Description                                                    |)

-### SYSTEM
+[//]: # (|-------------------------------------|----------------------------------------------------------------|)

-The `SYSTEM` instruction specifies the system message to be used in the template, if applicable.
+[//]: # (| [`FROM`]&#40;#from-required&#41; &#40;required&#41; | Defines the base model to use.                                 |)

-```modelfile
-SYSTEM """<system message>"""
-```
+[//]: # (| [`PARAMETER`]&#40;#parameter&#41;           | Sets the parameters for how Ollama will run the model.         |)

-### ADAPTER
+[//]: # (| [`TEMPLATE`]&#40;#template&#41;             | The full prompt template to be sent to the model.              |)

-The `ADAPTER` instruction specifies the LoRA adapter to apply to the base model. The value of this instruction should be
-an absolute path or a path relative to the Modelfile and the file must be in a GGML file format. The adapter should be
-tuned from the base model otherwise the behaviour is undefined.
+[//]: # (| [`SYSTEM`]&#40;#system&#41;                 | Specifies the system message that will be set in the template. |)

-```modelfile
-ADAPTER ./ollama-lora.bin
-```
+[//]: # (| [`ADAPTER`]&#40;#adapter&#41;               | Defines the &#40;Q&#41;LoRA adapters to apply to the model.            |)

-### LICENSE
+[//]: # (| [`LICENSE`]&#40;#license&#41;               | Specifies the legal license.                                   |)

-The `LICENSE` instruction allows you to specify the legal license under which the model used with this Modelfile is
-shared or distributed.
+[//]: # ()
+[//]: # (#### PARAMETER)

-```modelfile
-LICENSE """
-<license text>
-"""
-```
+[//]: # ()
+[//]: # (The `PARAMETER` instruction defines a parameter that can be set when the model is run.)

-## Notes
+[//]: # ()
+[//]: # (| Parameter      | Description                                                                                                                                                                                                                                             | Value Type | Example Usage        |)

- the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to
-  distinguish it from arguments.
- Instructions can be in any order. In the examples, the `FROM` instruction is first to keep it easily readable.
+[//]: # (|----------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------|----------------------|)

-Read more about Modelfile: https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md
+[//]: # (| mirostat       | Enable Mirostat sampling for controlling perplexity. &#40;default: 0, 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0&#41;                                                                                                                                         | int        | mirostat 0           |)
+
+[//]: # (| mirostat_eta   | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. &#40;Default: 0.1&#41;                        | float      | mirostat_eta 0.1     |)
+
+[//]: # (| mirostat_tau   | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. &#40;Default: 5.0&#41;                                                                                                         | float      | mirostat_tau 5.0     |)
+
+[//]: # (| num_ctx        | Sets the size of the context window used to generate the next token. &#40;Default: 2048&#41;                                                                                                                                                                    | int        | num_ctx 4096         |)
+
+[//]: # (| num_gqa        | The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for llama2:70b                                                                                                                                         | int        | num_gqa 1            |)
+
+[//]: # (| num_gpu        | The number of layers to send to the GPU&#40;s&#41;. On macOS it defaults to 1 to enable metal support, 0 to disable.                                                                                                                                            | int        | num_gpu 50           |)
+
+[//]: # (| num_thread     | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has &#40;as opposed to the logical number of cores&#41;. | int        | num_thread 8         |)
+
+[//]: # (| repeat_last_n  | Sets how far back for the model to look back to prevent repetition. &#40;Default: 64, 0 = disabled, -1 = num_ctx&#41;                                                                                                                                           | int        | repeat_last_n 64     |)
+
+[//]: # (| repeat_penalty | Sets how strongly to penalize repetitions. A higher value &#40;e.g., 1.5&#41; will penalize repetitions more strongly, while a lower value &#40;e.g., 0.9&#41; will be more lenient. &#40;Default: 1.1&#41;                                                                     | float      | repeat_penalty 1.1   |)
+
+[//]: # (| temperature    | The temperature of the model. Increasing the temperature will make the model answer more creatively. &#40;Default: 0.8&#41;                                                                                                                                     | float      | temperature 0.7      |)
+
+[//]: # (| seed           | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. &#40;Default: 0&#41;                                                                                       | int        | seed 42              |)
+
+[//]: # (| stop           | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile.                                      | string     | stop "AI assistant:" |)
+
+[//]: # (| tfs_z          | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value &#40;e.g., 2.0&#41; will reduce the impact more, while a value of 1.0 disables this setting. &#40;default: 1&#41;                                               | float      | tfs_z 1              |)
+
+[//]: # (| num_predict    | Maximum number of tokens to predict when generating text. &#40;Default: 128, -1 = infinite generation, -2 = fill context&#41;                                                                                                                                   | int        | num_predict 42       |)
+
+[//]: # (| top_k          | Reduces the probability of generating nonsense. A higher value &#40;e.g. 100&#41; will give more diverse answers, while a lower value &#40;e.g. 10&#41; will be more conservative. &#40;Default: 40&#41;                                                                        | int        | top_k 40             |)
+
+[//]: # (| top_p          | Works together with top-k. A higher value &#40;e.g., 0.95&#41; will lead to more diverse text, while a lower value &#40;e.g., 0.5&#41; will generate more focused and conservative text. &#40;Default: 0.9&#41;                                                                 | float      | top_p 0.9            |)
+
+[//]: # ()
+[//]: # (#### TEMPLATE)
+
+[//]: # ()
+[//]: # (`TEMPLATE` of the full prompt template to be passed into the model. It may include &#40;optionally&#41; a system message and a)
+
+[//]: # (user's prompt. This is used to create a full custom prompt, and syntax may be model specific. You can usually find the)
+
+[//]: # (template for a given model in the readme for that model.)
+
+[//]: # ()
+[//]: # (#### Template Variables)
+
+[//]: # ()
+[//]: # (| Variable        | Description                                                                                                   |)
+
+[//]: # (|-----------------|---------------------------------------------------------------------------------------------------------------|)
+
+[//]: # (| `{{ .System }}` | The system message used to specify custom behavior, this must also be set in the Modelfile as an instruction. |)
+
+[//]: # (| `{{ .Prompt }}` | The incoming prompt, this is not specified in the model file and will be set based on input.                  |)
+
+[//]: # (| `{{ .First }}`  | A boolean value used to render specific template information for the first generation of a session.           |)
+
+[//]: # ()
+[//]: # (```modelfile)
+
+[//]: # (TEMPLATE """)
+
+[//]: # ({{- if .First }})
+
+[//]: # (### System:)
+
+[//]: # ({{ .System }})
+
+[//]: # ({{- end }})
+
+[//]: # ()
+[//]: # (### User:)
+
+[//]: # ({{ .Prompt }})
+
+[//]: # ()
+[//]: # (### Response:)
+
+[//]: # (""")
+
+[//]: # ()
+[//]: # (SYSTEM """<system message>""")
+
+[//]: # (```)
+
+[//]: # ()
+[//]: # (### SYSTEM)
+
+[//]: # ()
+[//]: # (The `SYSTEM` instruction specifies the system message to be used in the template, if applicable.)
+
+[//]: # ()
+[//]: # (```modelfile)
+
+[//]: # (SYSTEM """<system message>""")
+
+[//]: # (```)
+
+[//]: # ()
+[//]: # (### ADAPTER)
+
+[//]: # ()
+[//]: # (The `ADAPTER` instruction specifies the LoRA adapter to apply to the base model. The value of this instruction should be)
+
+[//]: # (an absolute path or a path relative to the Modelfile and the file must be in a GGML file format. The adapter should be)
+
+[//]: # (tuned from the base model otherwise the behaviour is undefined.)
+
+[//]: # ()
+[//]: # (```modelfile)
+
+[//]: # (ADAPTER ./ollama-lora.bin)
+
+[//]: # (```)
+
+[//]: # ()
+[//]: # (### LICENSE)
+
+[//]: # ()
+[//]: # (The `LICENSE` instruction allows you to specify the legal license under which the model used with this Modelfile is)
+
+[//]: # (shared or distributed.)
+
+[//]: # ()
+[//]: # (```modelfile)
+
+[//]: # (LICENSE """)
+
+[//]: # (<license text>)
+
+[//]: # (""")
+
+[//]: # (```)
+
+[//]: # ()
+[//]: # (## Notes)
+
+[//]: # ()
+[//]: # (- the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to)
+
+[//]: # (  distinguish it from arguments.)
+
+[//]: # (- Instructions can be in any order. In the examples, the `FROM` instruction is first to keep it easily readable.)
+
+[//]: # ()
+[//]: # (Read more about Modelfile: https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md)
--- a/docs/docs/apis-model-management/delete-model.md
+++ b/docs/docs/apis-model-management/delete-model.md
@@ -1,28 +1,13 @@
 ---
-sidebar_position: 5
+sidebar_position: 6
 ---

+import CodeEmbed from '@site/src/components/CodeEmbed';
+
 # Delete Model

 This API lets you create a delete a model from the Ollama server.

-```java title="DeleteModel.java"
-import io.github.ollama4j.OllamaAPI;
-
-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        ollamaAPI.setVerbose(false);
-
-        ollamaAPI.deleteModel("mycustommodel", true);
-    }
-}
-
-```
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/DeleteModel.java" />

 Once deleted, you can verify it using [list models](./list-models) API.
--- a/docs/docs/apis-model-management/get-model-details.md
+++ b/docs/docs/apis-model-management/get-model-details.md
--- a/docs/docs/apis-model-management/list-models.md
+++ b/docs/docs/apis-model-management/list-models.md
@@ -1,35 +1,24 @@
 ---
-sidebar_position: 1
+sidebar_position: 2
 ---

-# List Models
+import CodeEmbed from '@site/src/components/CodeEmbed';

-This API lets you list available models on the Ollama server.
+# List Library Models

-```java title="ListModels.java"
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.models.response.Model;
+This API lets you list downloaded/available models on the Ollama server.

-import java.util.List;
+<CodeEmbed
+src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/ListLocalModels.java'>
+</CodeEmbed>

-public class ListModels {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        List<Model> models = ollamaAPI.listModels();
-
-        models.forEach(model -> System.out.println(model.getName()));
-    }
-}
-```

 If you have any models already downloaded on Ollama server, you would have them listed as follows:

 ```bash
 llama2:latest
+llama3.2:1b
+qwen2:0.5b
+qwen:0.5b
 sqlcoder:latest
 ```
--- a/docs/docs/apis-model-management/pull-model.md
+++ b/docs/docs/apis-model-management/pull-model.md
@@ -1,26 +1,15 @@
 ---
-sidebar_position: 2
+sidebar_position: 3
 ---

+import CodeEmbed from '@site/src/components/CodeEmbed';
+
 # Pull Model

 This API lets you pull a model on the Ollama server.

-```java title="PullModel.java"
-import io.github.ollama4j.OllamaAPI;
-import io.github.ollama4j.types.OllamaModelType;
+<CodeEmbed
+src='https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/PullModel.java'>
+</CodeEmbed>

-public class Main {
-
-    public static void main(String[] args) {
-
-        String host = "http://localhost:11434/";
-
-        OllamaAPI ollamaAPI = new OllamaAPI(host);
-
-        ollamaAPI.pullModel(OllamaModelType.LLAMA2);
-    }
-}
-```
-
-Once downloaded, you can see them when you use [list models](./list-models) API.
+Once downloaded, you can see it when you use the [list models](./list-models) API.
--- a/docs/docs/intro.md
+++ b/docs/docs/intro.md
@@ -1,19 +1,22 @@
 ---
 sidebar_position: 1
+
+title: Introduction
 ---

+import LatestRelease from '@site/src/components/LatestRelease';
+import AddToYourProject from '@site/src/components/AddToYourProject';
+
 # Introduction

-Let's get started with **Ollama4j**.
-
-## 🦙 What is Ollama?
+### 🦙 What is Ollama?

 [Ollama](https://ollama.ai/) is an advanced AI tool that allows users to easily set up and run large language models
 locally (in CPU and GPU
 modes). With Ollama, users can leverage powerful language models such as Llama 2 and even customize and create their own
 models.

-## 👨‍💻 Why Ollama4j?
+### 👨‍💻 Why Ollama4j?

 Ollama4j was built for the simple purpose of integrating Ollama with Java applications.

@@ -34,16 +37,16 @@ Ollama4j was built for the simple purpose of integrating Ollama with Java applic
    end
 ```

-## Getting Started
+### Getting Started

-### What you'll need
+#### What you'll need

 - **[Ollama](https://ollama.ai/download)**
 - **[Oracle JDK](https://www.oracle.com/java/technologies/javase/jdk11-archive-downloads.html)** or
  **[Open JDK](https://jdk.java.net/archive/)** 11.0 or above.
 - **[Maven](https://maven.apache.org/download.cgi)**

-### Start Ollama server
+#### Start Ollama server

 The easiest way of getting started with Ollama server is with [Docker](https://docs.docker.com/get-started/overview/).
 But if you choose to run the
@@ -69,22 +72,17 @@ terminal of your code editor.

 The command runs the Ollama server locally at **http://localhost:11434/**.

-### Setup your project
-
-Get started by **creating a new Maven project** on your favorite IDE.
+#### Setup your project

 Add the dependency to your project's `pom.xml`.

-```xml
+<AddToYourProject/>

-<dependency>
-    <groupId>io.github.ollama4j</groupId>
-    <artifactId>ollama4j</artifactId>
-    <version>1.0.78</version>
-</dependency>
-```
+<div style={{ marginTop: '2rem', marginBottom: '2rem', fontSize: '1em', textAlign: 'left', display: 'flex', justifyContent: 'left'}}>
+    <LatestRelease showReleaseDate={true} style={{textAlign: 'left', fontWeight: 'normal'}}/>
+</div>

-Find the latest version of the library [here](https://central.sonatype.com/artifact/io.github.ollama4j/ollama4j).
+Find the latest version of the library from [Maven Central Repository](https://central.sonatype.com/artifact/io.github.ollama4j/ollama4j).

 You might want to include an implementation of [SL4J](https://www.slf4j.org/) logger in your `pom.xml` file. For
 example,
@@ -92,7 +90,6 @@ example,
 Use `slf4j-jdk14` implementation:

 ```xml
-
 <dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-jdk14</artifactId>
@@ -103,7 +100,6 @@ Use `slf4j-jdk14` implementation:
 or use `logback-classic` implementation:

 ```xml
-
 <dependency>
    <groupId>ch.qos.logback</groupId>
    <artifactId>logback-classic</artifactId>
@@ -116,14 +112,14 @@ or use other suitable implementations.
 Create a new Java class in your project and add this code.

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;

-public class OllamaAPITest {
+public class OllamaTest {

    public static void main(String[] args) {
-        OllamaAPI ollamaAPI = new OllamaAPI();
+        Ollama ollama = new Ollama();

-        boolean isOllamaServerReachable = ollamaAPI.ping();
+        boolean isOllamaServerReachable = ollama.ping();

        System.out.println("Is Ollama server running: " + isOllamaServerReachable);
    }
@@ -134,18 +130,16 @@ This uses the default Ollama host as `http://localhost:11434`.
 Specify a different Ollama host that you want to connect to.

 ```java
-import io.github.ollama4j.OllamaAPI;
+import io.github.ollama4j.Ollama;

-public class OllamaAPITest {
+public class OllamaTest {

    public static void main(String[] args) {
        String host = "http://localhost:11434/";

-        OllamaAPI ollamaAPI = new OllamaAPI(host);
+        Ollama ollama = new Ollama(host);

-        ollamaAPI.setVerbose(true);
-
-        boolean isOllamaServerReachable = ollamaAPI.ping();
+        boolean isOllamaServerReachable = ollama.ping();

        System.out.println("Is Ollama server running: " + isOllamaServerReachable);
    }
--- a/docs/docs/metrics.md
+++ b/docs/docs/metrics.md
@@ -0,0 +1,90 @@
+---
+sidebar_position: 6
+
+title: Metrics
+---
+
+import CodeEmbed from '@site/src/components/CodeEmbed';
+
+# Metrics
+
+:::warning[Note]
+This is a work in progress.
+:::
+
+Monitoring and understanding the performance of your models and requests is crucial for optimizing and maintaining your
+applications. The Ollama4j library provides built-in support for collecting and exposing various metrics, such as
+request counts, response times, and error rates. These metrics can help you:
+
+- Track usage patterns and identify bottlenecks
+- Monitor the health and reliability of your services
+- Set up alerts for abnormal behavior
+- Gain insights for scaling and optimization
+
+## Available Metrics
+
+Ollama4j exposes several key metrics, including:
+
+- **Total Requests**: The number of requests processed by the model.
+- **Response Time**: The time taken to generate a response for each request.
+- **Error Rate**: The percentage of requests that resulted in errors.
+- **Active Sessions**: The number of concurrent sessions or users.
+
+These metrics can be accessed programmatically or integrated with monitoring tools such as Prometheus or Grafana for
+visualization and alerting.
+
+## Example Metrics Dashboard
+
+Below is an example of a metrics dashboard visualizing some of these key statistics:
+
+![Img](https://raw.githubusercontent.com/ollama4j/ollama4j/main/metrics.png)
+
+## Example: Accessing Metrics in Java
+
+You can easily access and display metrics in your Java application using Ollama4j.
+
+Make sure you have added the `simpleclient_httpserver` dependency to your app so that it can expose the metrics via the
+`/metrics` endpoint:
+
+```xml
+
+<dependency>
+    <groupId>io.prometheus</groupId>
+    <artifactId>simpleclient_httpserver</artifactId>
+    <version>0.16.0</version>
+</dependency>
+```
+
+Here is a sample code snippet that exposes the metrics so they can be visualized in Grafana:
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/src/main/java/io/github/ollama4j/examples/MetricsExample.java" />
+
+This will start a simple HTTP server with the `/metrics` endpoint enabled. Metrics will now be available
+at: http://localhost:8080/metrics
+
+## Integrating with Monitoring Tools
+
+### Grafana
+
+Use the following sample `docker-compose` file to host a basic Grafana container.
+
+<CodeEmbed src="https://raw.githubusercontent.com/ollama4j/ollama4j-examples/refs/heads/main/docker/docker-compose.yml" language="yaml" />
+
+And run:
+
+```shell
+docker-compose -f path/to/your/docker-compose.yml up
+```
+
+This starts Grafana at http://localhost:3000
+
+
+[//]: # (To integrate Ollama4j metrics with external monitoring systems, you can export the metrics endpoint and configure your)
+
+[//]: # (monitoring tool to scrape or collect the data. Refer to the [integration guide]&#40;../integration/monitoring.md&#41; for)
+
+[//]: # (detailed instructions.)
+
+[//]: # ()
+
+[//]: # (For more information on customizing and extending metrics, see the [API documentation]&#40;../api/metrics.md&#41;.)
--- a/docs/docusaurus.config.js
+++ b/docs/docusaurus.config.js
@@ -24,7 +24,6 @@ const config = {
    projectName: 'ollama4j', // Usually your repo name.

    onBrokenLinks: 'throw',
-    onBrokenMarkdownLinks: 'warn',

    // Even if you don't use internationalization, you can use this field to set
    // useful metadata like html lang. For example, if your site is Chinese, you
@@ -84,6 +83,7 @@ const config = {
                        position: 'left',
                        label: 'Docs',
                    },
+                    {to: 'https://github.com/ollama4j/ollama4j-examples', label: 'Examples', position: 'left'},
                    {to: 'https://ollama4j.github.io/ollama4j/apidocs/', label: 'Javadoc', position: 'left'},
                    {to: 'https://ollama4j.github.io/ollama4j/doxygen/html/', label: 'Doxygen', position: 'left'},
                    {to: '/blog', label: 'Blog', position: 'left'},
@@ -98,11 +98,32 @@ const config = {
                style: 'dark',
                links: [
                    {
-                        title: 'Docs',
+                        title: 'Quick Links',
                        items: [
                            {
-                                label: 'Tutorial',
-                                to: '/intro',
+                                label: 'Ollama4j Examples',
+                                to: 'https://github.com/ollama4j/ollama4j-examples',
+                            },
+                            {
+                                label: 'Blog',
+                                to: '/blog',
+                            },
+                            {
+                                label: 'GitHub',
+                                href: 'https://github.com/ollama4j/ollama4j',
+                            },
+                        ],
+                    },
+                    {
+                        title: 'Stuff built with Ollama4j',
+                        items: [
+                            {
+                                label: 'Ollama4j Web UI',
+                                to: 'https://github.com/ollama4j/ollama4j-web-ui',
+                            },
+                            {
+                                label: 'Ollama4j Desktop UI with Swing',
+                                to: 'https://github.com/ollama4j/ollama4j-ui',
                            },
                        ],
                    },
@@ -118,20 +139,7 @@ const config = {
                                href: 'https://twitter.com/ollama4j',
                            },
                        ],
-                    },
-                    {
-                        title: 'More',
-                        items: [
-                            {
-                                label: 'Blog',
-                                to: '/blog',
-                            },
-                            {
-                                label: 'GitHub',
-                                href: 'https://github.com/ollama4j/ollama4j',
-                            },
-                        ],
-                    },
+                    }
                ],
                copyright: `Ollama4j Documentation ${new Date().getFullYear()}. Built with Docusaurus.`,
            },
@@ -140,9 +148,35 @@ const config = {
                darkTheme: prismThemes.dracula,
                additionalLanguages: ['java'],
            },
+            algolia: {
+                // The application ID provided by Algolia
+                appId: '7HJ3MZ6GBX',
+                // Public API key: it is safe to commit it
+                apiKey: '3037a6d8706a6347b1844ca6ecd582b0',
+                indexName: 'ollama4jio',
+                // Optional: see doc section below
+                contextualSearch: true,
+                // Optional: Specify domains where the navigation should occur through window.location instead of history.push. Useful when our Algolia config crawls multiple documentation sites and we want to navigate with window.location.href to them.
+                externalUrlRegex: 'external\\.com|domain\\.com',
+                // Optional: Replace parts of the item URLs from Algolia. Useful when using the same search index for multiple deployments using a different baseUrl. You can use regexp or string in the `from` param. For example: localhost:3000 vs myCompany.com/docs
+                replaceSearchResultPathname: {
+                    from: '/docs/', // or as RegExp: /\/docs\//
+                    to: '/',
+                },
+                // Optional: Algolia search parameters
+                searchParameters: {},
+                // Optional: path for the search page that is enabled by default (`false` to disable it)
+                searchPagePath: 'search',
+                // Optional: whether the insights feature is enabled or not on Docsearch (`false` by default)
+                insights: false,
+                //... other Algolia params
+            },
        }),
    markdown: {
        mermaid: true,
+        hooks: {
+            onBrokenMarkdownLinks: 'warn'
+        }
    },
    themes: ['@docusaurus/theme-mermaid']
 };
--- a/docs/package-lock.json
+++ b/docs/package-lock.json
--- a/docs/package.json
+++ b/docs/package.json
@@ -1,46 +1,51 @@
 {
-  "name": "ollama-4-j",
-  "version": "0.0.0",
-  "private": true,
-  "scripts": {
-    "docusaurus": "docusaurus",
-    "start": "docusaurus start",
-    "build": "docusaurus build",
-    "swizzle": "docusaurus swizzle",
-    "deploy": "docusaurus deploy",
-    "clear": "docusaurus clear",
-    "serve": "docusaurus serve",
-    "write-translations": "docusaurus write-translations",
-    "write-heading-ids": "docusaurus write-heading-ids"
-  },
-  "dependencies": {
-    "@docusaurus/core": "^3.4.0",
-    "@docusaurus/plugin-google-gtag": "^3.4.0",
-    "@docusaurus/preset-classic": "^3.4.0",
-    "@docusaurus/theme-mermaid": "^3.4.0",
-    "@mdx-js/react": "^3.0.0",
-    "clsx": "^2.0.0",
-    "prism-react-renderer": "^2.3.0",
-    "react": "^18.0.0",
-    "react-dom": "^18.0.0"
-  },
-  "devDependencies": {
-    "@docusaurus/module-type-aliases": "^3.4.0",
-    "@docusaurus/types": "^3.4.0"
-  },
-  "browserslist": {
-    "production": [
-      ">0.5%",
-      "not dead",
-      "not op_mini all"
-    ],
-    "development": [
-      "last 3 chrome version",
-      "last 3 firefox version",
-      "last 5 safari version"
-    ]
-  },
-  "engines": {
-    "node": ">=18.0"
-  }
+    "name": "ollama-4-j",
+    "version": "0.0.0",
+    "private": true,
+    "scripts": {
+        "docusaurus": "docusaurus",
+        "start": "docusaurus start",
+        "build": "docusaurus build",
+        "swizzle": "docusaurus swizzle",
+        "deploy": "docusaurus deploy",
+        "clear": "docusaurus clear",
+        "serve": "docusaurus serve",
+        "write-translations": "docusaurus write-translations",
+        "write-heading-ids": "docusaurus write-heading-ids"
+    },
+    "dependencies": {
+        "@docsearch/js": "^4.1.0",
+        "@docusaurus/core": "^3.9.0",
+        "@docusaurus/plugin-google-gtag": "^3.9.1",
+        "@docusaurus/preset-classic": "^3.9.1",
+        "@docusaurus/theme-mermaid": "^3.9.1",
+        "@iconify/react": "^6.0.2",
+        "@mdx-js/react": "^3.1.1",
+        "clsx": "^2.1.1",
+        "font-awesome": "^4.7.0",
+        "prism-react-renderer": "^2.4.1",
+        "react": "^19.2.0",
+        "react-dom": "^19.2.0",
+        "react-icons": "^5.5.0",
+        "react-image-gallery": "^1.4.0"
+    },
+    "devDependencies": {
+        "@docusaurus/module-type-aliases": "^3.9.1",
+        "@docusaurus/types": "^3.4.0"
+    },
+    "browserslist": {
+        "production": [
+            ">0.5%",
+            "not dead",
+            "not op_mini all"
+        ],
+        "development": [
+            "last 3 chrome version",
+            "last 3 firefox version",
+            "last 5 safari version"
+        ]
+    },
+    "engines": {
+        "node": ">=18.0"
+    }
 }
--- a/docs/src/components/AddToYourProject/index.js
+++ b/docs/src/components/AddToYourProject/index.js
@@ -0,0 +1,65 @@
+import React, { useState, useEffect } from 'react';
+import CodeBlock from '@theme/CodeBlock';
+
+const AddToYourProject = () => {
+    const [releaseInfo, setReleaseInfo] = useState(null);
+    const [loading, setLoading] = useState(true);
+    const [error, setError] = useState(null);
+
+    useEffect(() => {
+        const fetchLatestRelease = async () => {
+            setLoading(true);
+            setError(null);
+            try {
+                const response = await fetch('https://api.github.com/repos/ollama4j/ollama4j/releases/latest');
+                if (!response.ok) {
+                    throw new Error(`HTTP error! status: ${response.status}`);
+                }
+                const data = await response.json();
+                setReleaseInfo(data);
+            } catch (err) {
+                console.error('Failed to fetch release info:', err);
+                setError(err);
+            } finally {
+                setLoading(false);
+            }
+        };
+
+        fetchLatestRelease();
+    }, []);
+
+    return (
+        <div style={{ width: '100%' }}>
+            {loading ? (
+                <div>Loading latest release info...</div>
+            ) : error ? (
+                <div>Error: {error.message}</div>
+            ) : releaseInfo ? (
+                <>
+                    <h4>Using Maven <code>pom.xml</code></h4>
+                    <CodeBlock className="language-xml">
+                        {`<dependency>
+    <groupId>io.github.ollama4j</groupId>
+    <artifactId>ollama4j</artifactId>
+    <version>${releaseInfo.name}</version>
+</dependency>`}
+                    </CodeBlock>
+                    <h4>Using Groovy-based <code>build.gradle</code></h4>
+                    <CodeBlock className="language-groovy">
+                        {`dependencies {
+    implementation 'io.github.ollama4j:ollama4j:${releaseInfo.name}'
+}`}
+                    </CodeBlock>
+                    <h4>For Kotlin-based <code>build.gradle.kts</code></h4>
+                    <CodeBlock className="language-kotlin">
+                        {`dependencies {
+    implementation("io.github.ollama4j:ollama4j:${releaseInfo.name}")
+}`}
+                    </CodeBlock>
+                </>
+            ) : null}
+        </div>
+    );
+};
+
+export default AddToYourProject;
--- a/docs/src/components/CodeEmbed/index.js
+++ b/docs/src/components/CodeEmbed/index.js
@@ -0,0 +1,106 @@
+import React, {useState, useEffect} from 'react';
+import CodeBlock from '@theme/CodeBlock';
+import Icon from '@site/src/components/Icon';
+
+/**
+ * CodeEmbed component to display code fetched from a URL in a CodeBlock.
+ * @param {object} props
+ * @param {string} props.src - Source URL to fetch the code from.
+ * @param {string} [props.language='java'] - Language for syntax highlighting in CodeBlock.
+ */
+const CodeEmbed = ({src, language = 'java'}) => {
+    const [code, setCode] = useState('');
+    const [loading, setLoading] = useState(true);
+    const [error, setError] = useState(null);
+
+    useEffect(() => {
+        let isMounted = true;
+
+        const fetchCodeFromUrl = async (url) => {
+            if (!isMounted) return;
+
+            setLoading(true);
+            setError(null);
+
+            try {
+                const response = await fetch(url);
+                if (!response.ok) {
+                    throw new Error(`HTTP error! status: ${response.status}`);
+                }
+                const data = await response.text();
+                if (isMounted) {
+                    setCode(data);
+                }
+            } catch (err) {
+                console.error('Failed to fetch code:', err);
+                if (isMounted) {
+                    setError(err);
+                    setCode(`// Failed to load code from ${url}\n// ${err.message}`);
+                }
+            } finally {
+                if (isMounted) {
+                    setLoading(false);
+                }
+            }
+        };
+
+        if (src) {
+            fetchCodeFromUrl(src);
+        }
+
+        return () => {
+            isMounted = false;
+        };
+    }, [src]);
+
+    const githubUrl = src ? src.replace('https://raw.githubusercontent.com', 'https://github.com').replace('/refs/heads/', '/blob/') : null;
+    const fileName = src ? src.substring(src.lastIndexOf('/') + 1) : null;
+
+    const title = (
+        <div style={{display: 'flex', justifyContent: 'space-between', alignItems: 'center'}}>
+            <a
+                href={githubUrl}
+                target="_blank"
+                rel="noopener noreferrer"
+                style={{
+                    color: 'gray',
+                    textDecoration: 'none',
+                }}
+                onMouseOver={e => {
+                    e.target.style.textDecoration = 'underline';
+                }}
+                onMouseOut={e => {
+                    e.target.style.textDecoration = 'none';
+                }}
+            >
+                <span>{fileName}</span>
+            </a>
+            {githubUrl && (
+                <a href={githubUrl} target="_blank" rel="noopener noreferrer" style={{
+                    color: 'gray',
+                    fontSize: '0.9em',
+                    fontStyle: 'italic',
+                    display: 'inline-flex',
+                    alignItems: 'center'
+                }}>
+                    View on GitHub
+                    <Icon icon="mdi:github" height="1em"/>
+                </a>
+            )}
+        </div>
+    );
+
+    return (
+        loading ? (
+            <div>Loading code...</div>
+        ) : error ? (
+            <div>Error: {error.message}</div>
+        ) : (
+            <div style={{backgroundColor: 'transparent', padding: '0px', borderRadius: '5px'}}>
+                <CodeBlock title={title} language={language}>{code}</CodeBlock>
+            </div>
+        )
+    );
+};
+
+export default CodeEmbed;
--- a/docs/src/components/HomepageFeatures/index.js
+++ b/docs/src/components/HomepageFeatures/index.js
@@ -1,39 +1,107 @@
 import clsx from 'clsx';
 import Heading from '@theme/Heading';
 import styles from './styles.module.css';
+import Ollama4jUIImageCarousel from '@site/src/components/Ollama4jUIImageCarousel';

 const FeatureList = [
-  {
-    title: 'Easy LLM Integration',
-    Svg: require('@site/static/img/undraw_docusaurus_mountain.svg').default,
-    description: (
-      <>
-        Easy integration with Ollama, enabling the execution of large language models locally.
-
-      </>
-    ),
-  },
  {
    title: 'Developer-Friendly',
    Svg: require('@site/static/img/undraw_docusaurus_tree.svg').default,
    description: (
      <>
-        Clean and simple <code>APIs</code>, focused on seamless interaction with Ollama.
+        Provides clean and simple <code>APIs</code> for interacting with Ollama, including model management, chat functionality, and authentication support when Ollama is deployed behind a proxy server.
      </>
    ),
  },
  {
-    title: 'Powered by Java',
+    title: 'Chat with Images',
+    Svg: require('@site/static/img/undraw_docusaurus_mountain.svg').default,
+    description: (
+      <>
+        Supports interactions with vision/image models, allowing you to build applications that can understand and respond to visual content. Upload images directly into your chat sessions and receive intelligent, context-aware replies.
+      </>
+    ),
+  },
+  {
+    title: 'Tools Support',
    Svg: require('@site/static/img/undraw_docusaurus_react.svg').default,
    description: (
      <>
-        Empowers Java developers to harness the full capabilities of Ollama.
+        Supports tool/function calling with tool calling models such as mistral, llama3.x, qwen, etc. Empower your models to interact with external services and data sources, enabling more complex and dynamic AI workflows.
      </>
    ),
  },
 ];
+const UsageList = [
+  {
+    title: 'Datafaker',
+    Svg: require('@site/static/img/datafaker.svg').default,
+    link: 'https://www.datafaker.net/',
+    description: (
+      <>
+        A powerful fake data generation library designed for JVM programs, offering over 200 data providers to easily create realistic and diverse datasets within minutes.
+      </>
+    ),
+    imageSize: '20%',
+  },
+  {
+    title: 'Katie',
+    Svg: require('@site/static/img/katie_logo_v3.svg').default,
+    link: 'https://katie.qa/home',
+    description: (
+      <>
+        An Open Source AI-based question-answering platform that helps companies and organizations make their private domain knowledge accessible and useful to their employees and customers.
+      </>
+    ),
+    imageSize: '30%',
+  },
+  {
+    title: 'AI Player',
+    Svg: require('@site/static/img/ai-player.svg').default,
+    link: 'https://modrinth.com/mod/ai-player',
+    description: (
+      <>
+        A Minecraft mod that aims to add an intelligent "second player" to the game.
+      </>
+    ),
+    imageSize: '15%',
+  },
+  {
+    title: 'Ollama Translator Plugin',
+    Svg: require('@site/static/img/minecraft-spigot.svg').default,
+    link: 'https://github.com/liebki/ollama-translator',
+    description: (
+      <>
+        A Minecraft 1.21 Spigot plugin that makes it easy to break language barriers by using Ollama on the server to translate all messages into a specific target language.
+      </>
+    ),
+    imageSize: '20%',
+  },
+  {
+    title: 'JnsCLI',
+    Svg: require('@site/static/img/jnscli.svg').default,
+    link: 'https://github.com/mirum8/jnscli',
+    description: (
+      <>
+        JnsCLI is a command-line tool for Jenkins, allowing you to manage jobs, builds, and configurations directly from the terminal. It also features AI-powered error analysis for quick troubleshooting.
+      </>
+    ),
+    imageSize: '20%',
+  },
+  {
+    title: 'Featured in a Research Article on AI-Assisted Code Optimization',
+    Svg: require('@site/static/img/pmc-logo.svg').default,
+    link: 'https://pmc.ncbi.nlm.nih.gov/articles/PMC11750896/',
+    description: (
+      <>
+        Ollama4j was used in a research article – “Large Language Model Based Mutations in Genetic Improvement” (PubMed Central).
+      </>
+    ),
+    imageSize: '50%',
+  },
+];

-function Feature({Svg, title, description}) {
+function Feature({ Svg, title, description }) {
  return (
    <div className={clsx('col col--4')}>
      <div className="text--center">
@@ -47,16 +115,58 @@ function Feature({Svg, title, description}) {
  );
 }

-export default function HomepageFeatures() {
+function Usage({ Svg, title, description, link, imageSize }) {
  return (
-    <section className={styles.features}>
-      <div className="container">
-        <div className="row">
-          {FeatureList.map((props, idx) => (
-            <Feature key={idx} {...props} />
-          ))}
-        </div>
+    <div className={clsx('col col--4')}>
+      <div className="text--center">
+        <Svg className={styles.featureSvg} role="img" style={{ width: imageSize }} />
      </div>
-    </section>
+      <div className="text--center padding-horiz--md">
+        <Heading as="h3" style={{ color: 'red' }}>
+          {link ? (
+            <a href={link} target="_blank" rel="noopener noreferrer" style={{ color: '#11bc11' }}>
+              {title}
+            </a>
+          ) : (
+            <span style={{ color: 'red' }}>{title}</span>
+          )}
+        </Heading>
+        <p>{description}</p>
+      </div>
+    </div>
+  );
+}
+
+export default function HomepageFeatures() {
+  return (
+    <>
+      <section className={styles.features}>
+        <div className="container">
+          <div className="row">
+            {FeatureList.map((props, idx) => (
+              <Feature key={idx} {...props} />
+            ))}
+          </div>
+        </div>
+      </section>
+      <hr />
+      <section className={styles.features}>
+        <div className="container">
+          <div style={{ fontSize: '20px', fontWeight: 'bold', display: 'flex', flexDirection: 'column', alignItems: 'center' }}>
+            Check out who's using Ollama4j
+          </div>
+          <div className="row">
+            {UsageList.map((props, idx) => (
+              <Usage key={idx} {...props} />
+            ))}
+          </div>
+        </div>
+      </section>
+      <section className={styles.features}>
+        <div className="container">
+          <Ollama4jUIImageCarousel></Ollama4jUIImageCarousel>
+        </div>
+      </section>
+    </>
  );
 }
--- a/docs/src/components/Icon/index.js
+++ b/docs/src/components/Icon/index.js
@@ -0,0 +1,9 @@
+// @site/src/components/Icon/index.js
+import React from 'react';
+import { Icon as IconifyIcon } from '@iconify/react';
+
+const IIcon = ({ icon, color, width = '24', height = '24' }) => (
+  <IconifyIcon icon={icon} color={color} width={width} height={height} />
+);
+
+export default IIcon;
--- a/docs/src/components/LatestRelease/index.js
+++ b/docs/src/components/LatestRelease/index.js
@@ -0,0 +1,52 @@
+import React, { useState, useEffect } from 'react';
+
+const LatestRelease = ({ showReleaseDate }) => {
+    const [releaseInfo, setReleaseInfo] = useState(null);
+    const [loading, setLoading] = useState(true);
+    const [error, setError] = useState(null);
+
+    useEffect(() => {
+        const fetchLatestRelease = async () => {
+            setLoading(true);
+            setError(null);
+            try {
+                const response = await fetch('https://api.github.com/repos/ollama4j/ollama4j/releases/latest');
+                if (!response.ok) {
+                    throw new Error(`HTTP error! status: ${response.status}`);
+                }
+                const data = await response.json();
+                setReleaseInfo(data);
+            } catch (err) {
+                console.error('Failed to fetch release info:', err);
+                setError(err);
+            } finally {
+                setLoading(false);
+            }
+        };
+
+        fetchLatestRelease();
+    }, []);
+
+    return (
+        <div style={{ display: 'flex', justifyContent: 'center' }}>
+            {loading ? (
+                <div>Loading latest release info...</div>
+            ) : error ? (
+                <div>Error: {error.message}</div>
+            ) : releaseInfo ? (
+                <div>
+                    {/* <h4 style={{ display: 'flex', justifyContent: 'center'}}>Latest Release</h4> */}
+                    <div>
+                        <span style={{ fontWeight: 'bold'}}>Latest Version</span>: <a href={releaseInfo.html_url} target='_blank' rel="noopener noreferrer"><span style={{color: 'white', fontWeight: 'bold', backgroundColor:'#11bc11', borderRadius: '15px', padding: '5px'}}>{releaseInfo.name}</span></a>
+                        {showReleaseDate && ` released on ${new Date(releaseInfo.published_at).toLocaleDateString(undefined, { year: 'numeric', month: 'long', day: 'numeric' })}`}
+                    </div>
+                    {/* <pre style={{ whiteSpace: 'pre-wrap' }}>
+                        {JSON.stringify(releaseInfo, null, 2)}
+                    </pre> */}
+                </div>
+            ) : null}
+        </div>
+    );
+};
+
+export default LatestRelease;
--- a/docs/src/components/Ollama4jUIImageCarousel/index.js
+++ b/docs/src/components/Ollama4jUIImageCarousel/index.js
@@ -0,0 +1,54 @@
+import React from 'react';
+import "react-image-gallery/styles/css/image-gallery.css";
+import ImageGallery from "react-image-gallery";
+
+const images = [
+    {
+        original: require('@site/static/img/ollama4j-ui/1.png').default,
+        thumbnail: require('@site/static/img/ollama4j-ui/1.png').default,
+    },
+    {
+        original: require('@site/static/img/ollama4j-ui/2.png').default,
+        thumbnail: require('@site/static/img/ollama4j-ui/2.png').default,
+    },
+    {
+        original: require('@site/static/img/ollama4j-ui/3.png').default,
+        thumbnail: require('@site/static/img/ollama4j-ui/3.png').default,
+    },
+    {
+        original: require('@site/static/img/ollama4j-ui/4.png').default,
+        thumbnail: require('@site/static/img/ollama4j-ui/4.png').default,
+    },
+];
+
+class Ollama4jUIImageCarousel extends React.Component {
+    renderItem = (item) => {
+        return (
+            <div className="image-gallery-image" style={{ textAlign: 'center' }}>
+                <img
+                    src={item.original}
+                    alt=""
+                    style={{ maxHeight: '500px', width: 'auto', maxWidth: '100%' }}
+                />
+            </div>
+        );
+    };
+
+    render() {
+        return (
+            <div style={{ margin: '0 auto', maxWidth: '800px' }}>
+                <div style={{ display: 'flex', flexDirection: 'column', justifyContent: 'center', alignItems: 'center', textAlign: 'center' }}>
+                    <h2>Explore the stuff we have built with Ollama4j</h2>
+                    <h4>
+                        <a href="https://github.com/ollama4j/ollama4j-ui" target='_blank' rel="noopener noreferrer">
+                            Ollama4j UI - Desktop UI built in Java with Swing
+                        </a>
+                    </h4>
+                </div>
+                <ImageGallery items={images} renderItem={this.renderItem} />
+            </div>
+        );
+    }
+}
+
+export default Ollama4jUIImageCarousel;
--- a/docs/src/components/TypewriterTextarea/index.js
+++ b/docs/src/components/TypewriterTextarea/index.js
@@ -0,0 +1,73 @@
+import React, { useEffect, useState, useRef } from 'react';
+
+const TypewriterTextarea = ({
+  textContent,
+  typingSpeed = 50,
+  pauseBetweenSentences = 1000,
+  height = '200px',
+  width = '100%',
+  align = 'left',
+  style = {},
+}) => {
+  const [displayedText, setDisplayedText] = useState('');
+  const [charIndex, setCharIndex] = useState(0);
+  const isTyping = useRef(false);
+
+  // Use textContent as-is (newlines are kept); fall back to an empty string
+  const fullText = textContent || '';
+
+  useEffect(() => {
+    if (!fullText) return;
+
+    if (!isTyping.current) {
+      isTyping.current = true;
+    }
+
+    if (charIndex > fullText.length) {
+      // Reset to start from the beginning
+      setCharIndex(0);
+      setDisplayedText('');
+      return;
+    }
+
+    if (charIndex < fullText.length) {
+      const timeout = setTimeout(() => {
+        setDisplayedText(fullText.slice(0, charIndex + 1));
+        setCharIndex((prevCharIndex) => prevCharIndex + 1);
+      }, typingSpeed);
+      return () => clearTimeout(timeout);
+    } else {
+      // Wait a bit, then restart
+      const timeout = setTimeout(() => {
+        setCharIndex(0);
+        setDisplayedText('');
+      }, pauseBetweenSentences);
+      return () => clearTimeout(timeout);
+    }
+    // eslint-disable-next-line
+  }, [charIndex, fullText, typingSpeed, pauseBetweenSentences]);
+
+  return (
+    <div
+      style={{
+        width: typeof width === 'number' ? `${width}px` : width,
+        height: height,
+        padding: '1rem',
+        fontFamily: 'monospace',
+        fontSize: '1rem',
+        backgroundColor: '#f4f4f4',
+        border: '1px solid #ccc',
+        textAlign: align,
+        resize: 'none',
+        whiteSpace: 'pre-wrap',
+        color: 'black',
+        overflow: 'auto',
+        ...style,
+      }}
+    >
+      {displayedText}
+    </div>
+  );
+};
+
+export default TypewriterTextarea;
--- a/docs/src/css/custom.css
+++ b/docs/src/css/custom.css
@@ -4,6 +4,8 @@
 * work well for content-centric websites.
 */

+ @import 'font-awesome/css/font-awesome.min.css';
+
 /* You can override the default Infima variables here. */
 :root {
    --ifm-color-primary: #2e8555;
--- a/docs/src/pages/index.js
+++ b/docs/src/pages/index.js
@@ -7,39 +7,60 @@ import BuyMeACoffee from '@site/src/components/BuyMeACoffee';
 import Heading from '@theme/Heading';
 import styles from './index.module.css';
 import BrowserOnly from '@docusaurus/BrowserOnly';
+import LatestRelease from '@site/src/components/LatestRelease';
+import TypewriterTextarea from '@site/src/components/TypewriterTextarea';

 function HomepageHeader() {
  const {siteConfig} = useDocusaurusContext();
-  return (<header className={clsx('hero hero--primary', styles.heroBanner)}>
-    <div className="container">
-      <Heading as="h1" className="hero__title">
-        {siteConfig.title}
-      </Heading>
-      <img src="img/logo.svg" alt="Ollama4j Logo" className={styles.logo}
-           style={{maxWidth: '20vh'}}/>
-      <p className="hero__subtitle">{siteConfig.tagline}</p>
-      <div className={styles.buttons}>
-        <Link
-            className="button button--secondary button--lg"
-            to="/intro">
-          Getting Started
-        </Link>
+  return (
+    <header className={clsx('hero hero--primary', styles.heroBanner)}>
+      <div className="container">
+        <Heading as="h1" className="hero__title">
+          {siteConfig.title}
+        </Heading>
+        <img
+          src="img/logo.svg"
+          alt="Ollama4j Logo"
+          className={styles.logo}
+          style={{ maxWidth: '20vh' }}
+        />
+        <p className="hero__subtitle">{siteConfig.tagline}</p>
+        <div style={{ marginTop: '2rem' }}>
+          <TypewriterTextarea
+            textContent='Hello there! I’m a handy little Java library that helps you talk to an Ollama server — nice and easy.'
+            typingSpeed={30}
+            pauseBetweenSentences={1200}
+            height='130px'
+            width='100%'
+            align='center'
+          />
+        </div>
+        <div className={styles.buttons} >
+          <Link className="button button--secondary button--lg" to="/intro" style={{ marginTop:'2rem' }}>
+            Get Started
+          </Link>
+        </div>
+        <div style={{ marginTop: '3rem' }}>
+          <LatestRelease showReleaseDate={false} />
+        </div>
      </div>
-    </div>
-  </header>);
+    </header>
+  );
 }

 export default function Home() {
  const {siteConfig} = useDocusaurusContext();
-  return (<Layout
-      title={`Hello from ${siteConfig.title}`}
+  return (
+    <Layout
+      title={`${siteConfig.title}`}
      description="Description will go into a meta tag in <head />">
-    <HomepageHeader/>
-    <main>
-      <HomepageFeatures/>
-      <BrowserOnly>
-        {() => <BuyMeACoffee />}
-      </BrowserOnly>
-    </main>
-  </Layout>);
+      <HomepageHeader />
+      <main>
+        <HomepageFeatures />
+        <BrowserOnly>
+          {() => <BuyMeACoffee />}
+        </BrowserOnly>
+      </main>
+    </Layout>
+  );
 }
--- a/docs/static/img/ai-player.svg
+++ b/docs/static/img/ai-player.svg
@@ -0,0 +1,7 @@
+<svg width="96" height="96" viewBox="0 0 96 96" xmlns="http://www.w3.org/2000/svg">
+  <rect width="96" height="96" rx="16" fill="#3F0F8D"/>
+  <path d="M26 24H70V32H26V24Z" fill="white"/>
+  <path d="M36 44H44V52H36V44Z" fill="white"/>
+  <path d="M52 44H60V52H52V44Z" fill="white"/>
+  <path d="M36 60H60V68H36V60Z" fill="white"/>
+</svg>
--- a/docs/static/img/datafaker.svg
+++ b/docs/static/img/datafaker.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path fill="gray" d="M12 8a3 3 0 0 0 3-3 3 3 0 0 0-3-3 3 3 0 0 0-3 3 3 3 0 0 0 3 3m0 3.54C9.64 9.35 6.5 8 3 8v11c3.5 0 6.64 1.35 9 3.54 2.36-2.19 5.5-3.54 9-3.54V8c-3.5 0-6.64 1.35-9 3.54z"></path></svg>
--- a/docs/static/img/jnscli.svg
+++ b/docs/static/img/jnscli.svg
--- a/docs/static/img/katie_logo_v3.svg
+++ b/docs/static/img/katie_logo_v3.svg
@@ -0,0 +1,13 @@
+<svg width="539" height="122" viewBox="0 0 539 122" fill="none" xmlns="http://www.w3.org/2000/svg">
+<path d="M233.794 106.267L194.606 73.1369L230.215 42.896C233.395 40.1969 233.755 35.4621 231.017 32.3237C228.282 29.1851 223.482 28.8263 220.301 31.533L177.914 67.5271V10.039C177.914 5.89749 174.511 2.54163 170.316 2.54163C166.119 2.54163 162.717 5.89749 162.717 10.039V111.961C162.717 116.103 166.119 119.458 170.316 119.458C174.511 119.458 177.914 116.103 177.914 111.961V78.7709L223.905 117.655C225.339 118.865 227.096 119.458 228.847 119.458C230.988 119.458 233.116 118.571 234.62 116.84C237.35 113.694 236.981 108.961 233.794 106.267Z" fill="#414041"/>
+<path d="M323.056 30.4831C318.932 30.4831 315.592 33.8533 315.592 38.0129V42.0425C307.753 33.3984 296.966 27.9584 285.263 27.9584C262.554 27.9584 244.075 49.0502 244.075 74.9781C244.075 100.907 262.554 122 285.263 122C298.217 122 308.397 116.775 315.592 108.901V112.96C315.592 117.12 318.932 120.49 323.056 120.49C327.178 120.49 330.519 117.12 330.519 112.96V38.0129C330.519 33.8533 327.178 30.4831 323.056 30.4831ZM285.263 106.941C271.027 106.941 259.002 92.3023 259.002 74.9781C259.002 57.6537 271.027 43.018 285.263 43.018C298.803 43.018 313.299 55.8594 313.299 74.9781C313.299 95.962 299.197 106.941 285.263 106.941Z" fill="#414041"/>
+<path d="M411.244 105.137C409.576 101.309 405.12 99.5521 401.277 101.222C400.739 101.454 400.076 101.763 399.304 102.121C395.776 103.764 385.034 108.765 380.927 106.146C378.976 104.904 377.859 100.863 377.859 95.0546V43.0501H394.522C398.705 43.0501 402.096 39.6633 402.096 35.4831C402.096 31.3029 398.705 27.9159 394.522 27.9159H377.859V7.56714C377.859 3.38695 374.47 0 370.285 0C366.102 0 362.711 3.38695 362.711 7.56714V27.9159H350.805C346.621 27.9159 343.231 31.3029 343.231 35.4831C343.231 39.6633 346.621 43.0501 350.805 43.0501H362.711V95.0546C362.711 106.629 366.102 114.654 372.785 118.909C376.316 121.155 380.207 122 384.1 122C392.482 122 400.877 118.088 405.707 115.839C406.339 115.544 406.885 115.288 407.327 115.095C411.161 113.428 412.917 108.969 411.244 105.137Z" fill="#414041"/>
+<path d="M502.288 122C487.874 122 460.527 111.883 460.187 73.5909C460.052 58.2558 464.703 45.0328 473.288 36.3573C480.37 29.1992 489.739 25.4166 500.381 25.4166C526.708 25.4166 537.58 47.0116 538.976 67.2225C539.168 69.9859 538.2 72.7323 536.321 74.7481C534.467 76.7394 531.854 77.8793 529.147 77.8793H475.52C477.843 105.988 499.689 106.84 502.288 106.84C516.078 106.84 523.655 98.4106 523.729 98.327C526.452 95.1443 531.226 94.7713 534.398 97.4901C537.577 100.209 537.951 104.994 535.234 108.177C534.754 108.742 523.197 122 502.288 122ZM476.301 62.7194H523.227C519.981 43.6041 507.776 40.5766 500.381 40.5766C493.774 40.5766 488.273 42.7479 484.036 47.0289C480.257 50.851 477.621 56.2473 476.301 62.7194Z" fill="#414041"/>
+<path d="M434.759 33.0416C430.546 33.0416 427.132 36.4385 427.132 40.6307V111.869C427.132 116.061 430.546 119.458 434.759 119.458C438.973 119.458 442.387 116.061 442.387 111.869V40.6307C442.387 36.4385 438.973 33.0416 434.759 33.0416Z" fill="#414041"/>
+<path d="M434.759 22.875C440.376 22.875 444.929 18.3232 444.929 12.7083C444.929 7.09347 440.376 2.54163 434.759 2.54163C429.142 2.54163 424.59 7.09347 424.59 12.7083C424.59 18.3232 429.142 22.875 434.759 22.875Z" fill="#414041"/>
+<path d="M111.868 0H10.1698C4.55318 0 0 4.55177 0 10.1667V111.833C0 117.448 4.55318 122 10.1698 122H111.868C117.485 122 122.038 117.448 122.038 111.833V10.1667C122.038 4.55177 117.485 0 111.868 0Z" fill="#FF89BB"/>
+<path d="M119.138 3H3.10229V119H119.138V3Z" fill="#FF89BB"/>
+<path d="M37.0639 65.2438C41.7531 65.2438 45.5543 60.8104 45.5543 55.3414C45.5543 49.8724 41.7531 45.439 37.0639 45.439C32.3748 45.439 28.5735 49.8724 28.5735 55.3414C28.5735 60.8104 32.3748 65.2438 37.0639 65.2438Z" fill="#414041"/>
+<path d="M82.3464 56.756C87.0355 56.756 90.8368 52.3226 90.8368 46.8536C90.8368 41.3846 87.0355 36.9512 82.3464 36.9512C77.6573 36.9512 73.856 41.3846 73.856 46.8536C73.856 52.3226 77.6573 56.756 82.3464 56.756Z" fill="#414041"/>
+<path d="M59.8714 99.1951C42.5471 99.1951 30.8795 89.4538 30.3195 88.985C28.2142 87.1935 27.9803 84.0583 29.7949 81.9799C31.6096 79.9085 34.771 79.6705 36.8763 81.448C37.0748 81.616 49.2599 91.5603 65.9462 88.747C83.0366 85.8638 93.003 73.0154 93.1022 72.8894C94.768 70.699 97.9153 70.2511 100.134 71.8887C102.353 73.5262 102.821 76.6194 101.169 78.8168C100.673 79.4746 88.8208 94.9683 67.6333 98.5373C64.9468 98.9992 62.3453 99.1951 59.8714 99.1951Z" fill="#414041"/>
+</svg>
--- a/docs/static/img/minecraft-spigot.svg
+++ b/docs/static/img/minecraft-spigot.svg
@@ -0,0 +1 @@
+<svg xmlns="http://www.w3.org/2000/svg" shape-rendering="geometricPrecision" text-rendering="geometricPrecision" image-rendering="optimizeQuality" fill-rule="evenodd" clip-rule="evenodd" viewBox="0 0 512 448.012"><path fill="#424C57" fill-rule="nonzero" d="M386.912 404.082c-31.653 27.804-68.134 42.5-104.037 43.83-37.327 1.383-73.956-11.673-103.851-39.445l-15.148-14.072-54.139-44.075h-48.07c-25.406 0-43.633-20.936-46.214-46.212L.195 154.71c-2.581-25.279 20.879-48.258 46.214-46.215l189.276 15.258c25.335 2.042 46.214 20.793 46.214 46.215v134.14c0 25.418-20.805 46.212-46.214 46.212h-59.323l4.221 47.53 4.398 2.497c33.226 18.862 65.806 26.928 96.721 22.849 28.258-3.73 55.317-17.66 80.412-42.828l-15.449-13.603 64.278-14.309-7.532 67.307-16.499-15.681z"/><path fill="#337EC6" d="M276.315 73.91l113.118-9.119-30.176-17.127c-33.228-18.859-65.809-26.927-96.721-22.848-28.26 3.73-55.319 17.66-80.414 42.827l15.449 13.604-64.277 14.309 7.533-67.307 16.498 15.682C188.979 16.128 225.46 1.43 261.363.099c37.327-1.382 73.956 11.673 103.851 39.446l26.942 25.027 73.434-5.92c25.335-2.042 48.796 20.936 46.215 46.216l-15.258 149.397c-2.581 25.277-20.808 46.212-46.215 46.212h-50.099l-71.815 58.466 5.19-58.466h-57.293c-25.411 0-46.216-20.793-46.216-46.212V120.126c0-25.422 20.881-44.173 46.216-46.216z"/><path fill="#fff" d="M84.246 173.714h43.841v-12.353h19.585v12.353h44.001v20.281h-9.033c-1.629 12.811-5.21 23.368-11.297 34.746-5.149 9.611-11.796 18.87-19.516 27.744 12.32 14.735 27.616 27.554 45.826 39.24l-10.073 15.701c-19.119-12.267-35.356-25.774-48.699-41.332-11.936 11.547-25.496 22.303-39.7 32.127l-10.588-15.325c14.055-9.725 27.407-20.347 38.896-31.672-10.588-15.493-18.697-32.88-24.308-52.875l17.968-5.038c4.601 16.401 11.052 30.837 19.344 43.759 5.657-6.848 10.548-13.876 14.412-21.092 4.74-8.866 7.286-16.196 8.89-25.983H84.246v-20.281zM398.456 215.314h-55.931l-7.357 25.09h-27.814c11.988-31.729 25.831-69.829 37.825-101.573 4.318-11.466 9.234-30.467 24.943-30.467 16.284 0 21.691 17.409 26.222 29.436l38.4 103.122h-28.708l-7.58-25.608zm-6-24.398l-21.893-57.703-21.971 57.703h43.864z"/></svg>
--- a/docs/static/img/ollama4j-ui/1.png
+++ b/docs/static/img/ollama4j-ui/1.png
--- a/docs/static/img/ollama4j-ui/2.png
+++ b/docs/static/img/ollama4j-ui/2.png
--- a/docs/static/img/ollama4j-ui/3.png
+++ b/docs/static/img/ollama4j-ui/3.png
--- a/docs/static/img/ollama4j-ui/4.png
+++ b/docs/static/img/ollama4j-ui/4.png
--- a/docs/static/img/pmc-logo.svg
+++ b/docs/static/img/pmc-logo.svg
--- a/metrics.png
+++ b/metrics.png
--- a/ollama4j-new.jpeg
+++ b/ollama4j-new.jpeg
--- a/pom.xml
+++ b/pom.xml
@@ -13,12 +13,13 @@
    <packaging>jar</packaging>

    <properties>
-        <maven.compiler.source>11</maven.compiler.source>
-        <maven.compiler.target>11</maven.compiler.target>
+        <maven.compiler.release>11</maven.compiler.release>
+        <project.build.outputTimestamp>${git.commit.time}
+        </project.build.outputTimestamp><!-- populated via git-commit-id-plugin -->
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-        <maven-surefire-plugin.version>3.0.0-M5</maven-surefire-plugin.version>
-        <maven-failsafe-plugin.version>3.0.0-M5</maven-failsafe-plugin.version>
-        <lombok.version>1.18.30</lombok.version>
+        <maven-surefire-plugin.version>3.5.4</maven-surefire-plugin.version>
+        <maven-failsafe-plugin.version>3.5.4</maven-failsafe-plugin.version>
+        <lombok.version>1.18.40</lombok.version>
    </properties>

    <developers>
@@ -46,10 +47,23 @@

    <build>
        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-compiler-plugin</artifactId>
+                <configuration>
+                    <annotationProcessorPaths>
+                        <path>
+                            <groupId>org.projectlombok</groupId>
+                            <artifactId>lombok</artifactId>
+                            <version>${lombok.version}</version>
+                        </path>
+                    </annotationProcessorPaths>
+                </configuration>
+            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-source-plugin</artifactId>
-                <version>3.3.0</version>
+                <version>3.3.1</version>
                <executions>
                    <execution>
                        <id>attach-sources</id>
@@ -62,13 +76,21 @@
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-javadoc-plugin</artifactId>
-                <version>3.5.0</version>
+                <version>3.12.0</version>
+                <configuration>
+                    <!-- Suppress doclint "missing" warnings; remove this doclint setting to re-enable them -->
+                    <doclint>all,-missing</doclint>
+                </configuration>
                <executions>
                    <execution>
                        <id>attach-javadocs</id>
+                        <phase>package</phase>
                        <goals>
                            <goal>jar</goal>
                        </goals>
+                        <configuration>
+                            <outputDirectory>${project.build.directory}</outputDirectory>
+                        </configuration>
                    </execution>
                </executions>
            </plugin>
@@ -110,11 +132,10 @@
                </executions>
            </plugin>

-
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-gpg-plugin</artifactId>
-                <version>1.5</version>
+                <version>3.2.8</version>
                <executions>
                    <execution>
                        <id>sign-artifacts</id>
@@ -126,7 +147,99 @@
                </executions>
            </plugin>

+            <plugin>
+                <groupId>io.github.git-commit-id</groupId>
+                <artifactId>git-commit-id-maven-plugin</artifactId>
+                <version>9.0.2</version>
+                <executions>
+                    <execution>
+                        <goals>
+                            <goal>revision</goal>
+                        </goals>
+                    </execution>
+                </executions>
+                <configuration>
+                    <dateFormat>yyyy-MM-dd'T'HH:mm:ss'Z'</dateFormat>
+                    <dateFormatTimeZone>Etc/UTC</dateFormatTimeZone>
+                </configuration>
+            </plugin>
+
+            <plugin>
+                <groupId>com.diffplug.spotless</groupId>
+                <artifactId>spotless-maven-plugin</artifactId>
+                <version>3.0.0</version>
+                <configuration>
+                    <formats>
+                        <!-- you can define as many formats as you want, each is independent -->
+                        <format>
+                            <!-- define the files to apply to -->
+                            <includes>
+                                <include>.gitattributes</include>
+                                <include>.gitignore</include>
+                            </includes>
+                            <!-- define the steps to apply to those files -->
+                            <trimTrailingWhitespace/>
+                            <endWithNewline/>
+                            <indent>
+                                <tabs>true</tabs>
+                                <spacesPerTab>4</spacesPerTab>
+                            </indent>
+                        </format>
+                    </formats>
+                    <!-- define a language-specific format -->
+                    <java>
+                        <!-- no need to specify files, inferred automatically, but you can if you want -->
+
+                        <!-- apply a specific flavor of google-java-format and reflow long strings -->
+                        <googleJavaFormat>
+                            <version>1.28.0</version>
+                            <style>AOSP</style>
+                            <reflowLongStrings>true</reflowLongStrings>
+                            <formatJavadoc>false</formatJavadoc>
+                        </googleJavaFormat>
+
+                        <!-- make sure every file has the following copyright header.
+                          optionally, Spotless can set copyright years by digging
+                          through git history (see "license" section below) -->
+                        <licenseHeader>
+                            <content>
+<![CDATA[
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) $YEAR Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+]]>
+                            </content>  <!-- or <file>${project.basedir}/license-header</file> -->
+                        </licenseHeader>
+                    </java>
+                </configuration>
+                <executions>
+                    <execution>
+                        <goals>
+                            <goal>check</goal>
+                        </goals>
+                        <phase>compile</phase>
+                    </execution>
+                </executions>
+            </plugin>
        </plugins>
+
+        <pluginManagement>
+            <plugins>
+                <plugin>
+                    <artifactId>maven-compiler-plugin</artifactId>
+                    <version>3.14.1</version>
+                </plugin>
+                <plugin>
+                    <artifactId>maven-jar-plugin</artifactId>
+                    <version>3.4.2</version>
+                </plugin>
+            </plugins>
+        </pluginManagement>
    </build>

    <dependencies>
@@ -136,45 +249,78 @@
            <version>${lombok.version}</version>
            <scope>provided</scope>
        </dependency>
+        <dependency>
+            <groupId>org.jsoup</groupId>
+            <artifactId>jsoup</artifactId>
+            <version>1.21.2</version>
+        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.core</groupId>
            <artifactId>jackson-databind</artifactId>
-            <version>2.17.1</version>
+            <version>2.20.0</version>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.jackson.dataformat</groupId>
+            <artifactId>jackson-dataformat-yaml</artifactId>
+            <version>2.20.0</version>
        </dependency>
        <dependency>
            <groupId>com.fasterxml.jackson.datatype</groupId>
            <artifactId>jackson-datatype-jsr310</artifactId>
-            <version>2.17.1</version>
+            <version>2.20.0</version>
        </dependency>
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
-            <version>1.5.6</version>
+            <version>1.5.18</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
-            <version>2.0.9</version>
+            <version>2.0.17</version>
        </dependency>
        <dependency>
            <groupId>org.junit.jupiter</groupId>
            <artifactId>junit-jupiter-api</artifactId>
-            <version>5.10.0</version>
+            <version>5.13.4</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.mockito</groupId>
            <artifactId>mockito-core</artifactId>
-            <version>4.1.0</version>
+            <version>5.20.0</version>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.json</groupId>
            <artifactId>json</artifactId>
-            <version>20240205</version>
+            <version>20250517</version>
            <scope>test</scope>
        </dependency>
+        <dependency>
+            <groupId>org.testcontainers</groupId>
+            <artifactId>ollama</artifactId>
+            <version>1.20.2</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.testcontainers</groupId>
+            <artifactId>nginx</artifactId>
+            <version>1.21.3</version>
+            <scope>test</scope>
+        </dependency>
+        <!-- Prometheus metrics dependencies -->
+        <dependency>
+            <groupId>io.prometheus</groupId>
+            <artifactId>simpleclient</artifactId>
+            <version>0.16.0</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.guava</groupId>
+            <artifactId>guava</artifactId>
+            <version>33.5.0-jre</version>
+        </dependency>
    </dependencies>

    <distributionManagement>
@@ -200,7 +346,7 @@
                    <plugin>
                        <groupId>org.sonatype.central</groupId>
                        <artifactId>central-publishing-maven-plugin</artifactId>
-                        <version>0.5.0</version>
+                        <version>0.8.0</version>
                        <extensions>true</extensions>
                        <configuration>
                            <publishingServerId>mvn-repo-id</publishingServerId>
@@ -216,6 +362,7 @@
                <test.env>unit</test.env>
                <skipUnitTests>false</skipUnitTests>
                <skipIntegrationTests>true</skipIntegrationTests>
+                <skipGpgPluginDuringTests>true</skipGpgPluginDuringTests>
            </properties>
            <activation>
                <activeByDefault>false</activeByDefault>
@@ -225,7 +372,7 @@
                    <plugin>
                        <groupId>org.jacoco</groupId>
                        <artifactId>jacoco-maven-plugin</artifactId>
-                        <version>0.8.11</version>
+                        <version>0.8.13</version>
                        <executions>
                            <execution>
                                <goals>
@@ -241,6 +388,23 @@
                            </execution>
                        </executions>
                    </plugin>
+                    <plugin>
+                        <groupId>org.apache.maven.plugins</groupId>
+                        <artifactId>maven-gpg-plugin</artifactId>
+                        <version>3.2.8</version>
+                        <executions>
+                            <execution>
+                                <id>sign-artifacts</id>
+                                <phase>verify</phase>
+                                <goals>
+                                    <goal>sign</goal>
+                                </goals>
+                                <configuration>
+                                    <skip>${skipGpgPluginDuringTests}</skip>
+                                </configuration>
+                            </execution>
+                        </executions>
+                    </plugin>
                </plugins>
            </build>
        </profile>
@@ -250,7 +414,29 @@
                <test.env>integration</test.env>
                <skipUnitTests>true</skipUnitTests>
                <skipIntegrationTests>false</skipIntegrationTests>
+                <skipGpgPluginDuringTests>true</skipGpgPluginDuringTests>
            </properties>
+            <build>
+                <plugins>
+                    <plugin>
+                        <groupId>org.apache.maven.plugins</groupId>
+                        <artifactId>maven-gpg-plugin</artifactId>
+                        <version>3.2.8</version>
+                        <executions>
+                            <execution>
+                                <id>sign-artifacts</id>
+                                <phase>verify</phase>
+                                <goals>
+                                    <goal>sign</goal>
+                                </goals>
+                                <configuration>
+                                    <skip>${skipGpgPluginDuringTests}</skip>
+                                </configuration>
+                            </execution>
+                        </executions>
+                    </plugin>
+                </plugins>
+            </build>
        </profile>
        <profile>
            <id>ci-cd</id>
@@ -264,7 +450,7 @@
                    <plugin>
                        <groupId>org.apache.maven.plugins</groupId>
                        <artifactId>maven-gpg-plugin</artifactId>
-                        <version>3.1.0</version>
+                        <version>3.2.8</version>
                        <executions>
                            <execution>
                                <id>sign-artifacts</id>
@@ -286,7 +472,7 @@
                    <plugin>
                        <groupId>org.sonatype.plugins</groupId>
                        <artifactId>nexus-staging-maven-plugin</artifactId>
-                        <version>1.6.13</version>
+                        <version>1.7.0</version>
                        <extensions>true</extensions>
                        <configuration>
                            <serverId>ossrh</serverId>
@@ -294,11 +480,10 @@
                            <autoReleaseAfterClose>true</autoReleaseAfterClose>
                        </configuration>
                    </plugin>
-
                    <plugin>
                        <groupId>org.jacoco</groupId>
                        <artifactId>jacoco-maven-plugin</artifactId>
-                        <version>0.8.7</version>
+                        <version>0.8.13</version>
                        <executions>
                            <execution>
                                <goals>
@@ -319,4 +504,4 @@
        </profile>
    </profiles>

-</project>
+</project>
--- a/src/main/java/io/github/ollama4j/Ollama.java
+++ b/src/main/java/io/github/ollama4j/Ollama.java
--- a/src/main/java/io/github/ollama4j/OllamaAPI.java
+++ b/src/main/java/io/github/ollama4j/OllamaAPI.java
@@ -1,682 +0,0 @@
-package io.github.ollama4j;
-
-import io.github.ollama4j.exceptions.OllamaBaseException;
-import io.github.ollama4j.exceptions.ToolInvocationException;
-import io.github.ollama4j.exceptions.ToolNotFoundException;
-import io.github.ollama4j.models.chat.OllamaChatMessage;
-import io.github.ollama4j.models.chat.OllamaChatRequest;
-import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
-import io.github.ollama4j.models.chat.OllamaChatResult;
-import io.github.ollama4j.models.embeddings.OllamaEmbeddingResponseModel;
-import io.github.ollama4j.models.embeddings.OllamaEmbeddingsRequestModel;
-import io.github.ollama4j.models.generate.OllamaGenerateRequest;
-import io.github.ollama4j.models.generate.OllamaStreamHandler;
-import io.github.ollama4j.models.ps.ModelsProcessResponse;
-import io.github.ollama4j.models.request.*;
-import io.github.ollama4j.models.response.*;
-import io.github.ollama4j.tools.*;
-import io.github.ollama4j.utils.Options;
-import io.github.ollama4j.utils.Utils;
-import lombok.Setter;
-import org.slf4j.Logger;
-import org.slf4j.LoggerFactory;
-
-import java.io.*;
-import java.net.URI;
-import java.net.URISyntaxException;
-import java.net.http.HttpClient;
-import java.net.http.HttpConnectTimeoutException;
-import java.net.http.HttpRequest;
-import java.net.http.HttpResponse;
-import java.nio.charset.StandardCharsets;
-import java.nio.file.Files;
-import java.time.Duration;
-import java.util.*;
-
-/**
- * The base Ollama API class.
- */
-@SuppressWarnings("DuplicatedCode")
-public class OllamaAPI {
-
-    private static final Logger logger = LoggerFactory.getLogger(OllamaAPI.class);
-    private final String host;
-    /**
-     * -- SETTER --
-     * Set request timeout in seconds. Default is 3 seconds.
-     */
-    @Setter
-    private long requestTimeoutSeconds = 10;
-    /**
-     * -- SETTER --
-     * Set/unset logging of responses
-     */
-    @Setter
-    private boolean verbose = true;
-    private BasicAuth basicAuth;
-
-    private final ToolRegistry toolRegistry = new ToolRegistry();
-
-    /**
-     * Instantiates the Ollama API with default Ollama host: <a href="http://localhost:11434">http://localhost:11434</a>
-     **/
-    public OllamaAPI() {
-        this.host = "http://localhost:11434";
-    }
-
-    /**
-     * Instantiates the Ollama API with specified Ollama host address.
-     *
-     * @param host the host address of Ollama server
-     */
-    public OllamaAPI(String host) {
-        if (host.endsWith("/")) {
-            this.host = host.substring(0, host.length() - 1);
-        } else {
-            this.host = host;
-        }
-    }
-
-    /**
-     * Set basic authentication for accessing Ollama server that's behind a reverse-proxy/gateway.
-     *
-     * @param username the username
-     * @param password the password
-     */
-    public void setBasicAuth(String username, String password) {
-        this.basicAuth = new BasicAuth(username, password);
-    }
-
-    /**
-     * API to check the reachability of Ollama server.
-     *
-     * @return true if the server is reachable, false otherwise.
-     */
-    public boolean ping() {
-        String url = this.host + "/api/tags";
-        HttpClient httpClient = HttpClient.newHttpClient();
-        HttpRequest httpRequest = null;
-        try {
-            httpRequest =
-                    getRequestBuilderDefault(new URI(url))
-                            .header("Accept", "application/json")
-                            .header("Content-type", "application/json")
-                            .GET()
-                            .build();
-        } catch (URISyntaxException e) {
-            throw new RuntimeException(e);
-        }
-        HttpResponse<String> response = null;
-        try {
-            response = httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
-        } catch (HttpConnectTimeoutException e) {
-            return false;
-        } catch (IOException | InterruptedException e) {
-            throw new RuntimeException(e);
-        }
-        int statusCode = response.statusCode();
-        return statusCode == 200;
-    }
-
-    /**
-     * Provides a list of running models and details about each model currently loaded into memory.
-     *
-     * @return ModelsProcessResponse
-     */
-    public ModelsProcessResponse ps() throws IOException, InterruptedException, OllamaBaseException {
-        String url = this.host + "/api/ps";
-        HttpClient httpClient = HttpClient.newHttpClient();
-        HttpRequest httpRequest = null;
-        try {
-            httpRequest =
-                    getRequestBuilderDefault(new URI(url))
-                            .header("Accept", "application/json")
-                            .header("Content-type", "application/json")
-                            .GET()
-                            .build();
-        } catch (URISyntaxException e) {
-            throw new RuntimeException(e);
-        }
-        HttpResponse<String> response = null;
-        response = httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseString = response.body();
-        if (statusCode == 200) {
-            return Utils.getObjectMapper()
-                    .readValue(responseString, ModelsProcessResponse.class);
-        } else {
-            throw new OllamaBaseException(statusCode + " - " + responseString);
-        }
-    }
-
-    /**
-     * List available models from Ollama server.
-     *
-     * @return the list
-     */
-    public List<Model> listModels()
-            throws OllamaBaseException, IOException, InterruptedException, URISyntaxException {
-        String url = this.host + "/api/tags";
-        HttpClient httpClient = HttpClient.newHttpClient();
-        HttpRequest httpRequest =
-                getRequestBuilderDefault(new URI(url))
-                        .header("Accept", "application/json")
-                        .header("Content-type", "application/json")
-                        .GET()
-                        .build();
-        HttpResponse<String> response =
-                httpClient.send(httpRequest, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseString = response.body();
-        if (statusCode == 200) {
-            return Utils.getObjectMapper()
-                    .readValue(responseString, ListModelsResponse.class)
-                    .getModels();
-        } else {
-            throw new OllamaBaseException(statusCode + " - " + responseString);
-        }
-    }
-
-    /**
-     * Pull a model on the Ollama server from the list of <a
-     * href="https://ollama.ai/library">available models</a>.
-     *
-     * @param modelName the name of the model
-     */
-    public void pullModel(String modelName)
-            throws OllamaBaseException, IOException, URISyntaxException, InterruptedException {
-        String url = this.host + "/api/pull";
-        String jsonData = new ModelRequest(modelName).toString();
-        HttpRequest request =
-                getRequestBuilderDefault(new URI(url))
-                        .POST(HttpRequest.BodyPublishers.ofString(jsonData))
-                        .header("Accept", "application/json")
-                        .header("Content-type", "application/json")
-                        .build();
-        HttpClient client = HttpClient.newHttpClient();
-        HttpResponse<InputStream> response =
-                client.send(request, HttpResponse.BodyHandlers.ofInputStream());
-        int statusCode = response.statusCode();
-        InputStream responseBodyStream = response.body();
-        String responseString = "";
-        try (BufferedReader reader =
-                     new BufferedReader(new InputStreamReader(responseBodyStream, StandardCharsets.UTF_8))) {
-            String line;
-            while ((line = reader.readLine()) != null) {
-                ModelPullResponse modelPullResponse =
-                        Utils.getObjectMapper().readValue(line, ModelPullResponse.class);
-                if (verbose) {
-                    logger.info(modelPullResponse.getStatus());
-                }
-            }
-        }
-        if (statusCode != 200) {
-            throw new OllamaBaseException(statusCode + " - " + responseString);
-        }
-    }
-
-    /**
-     * Gets model details from the Ollama server.
-     *
-     * @param modelName the model
-     * @return the model details
-     */
-    public ModelDetail getModelDetails(String modelName)
-            throws IOException, OllamaBaseException, InterruptedException, URISyntaxException {
-        String url = this.host + "/api/show";
-        String jsonData = new ModelRequest(modelName).toString();
-        HttpRequest request =
-                getRequestBuilderDefault(new URI(url))
-                        .header("Accept", "application/json")
-                        .header("Content-type", "application/json")
-                        .POST(HttpRequest.BodyPublishers.ofString(jsonData))
-                        .build();
-        HttpClient client = HttpClient.newHttpClient();
-        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseBody = response.body();
-        if (statusCode == 200) {
-            return Utils.getObjectMapper().readValue(responseBody, ModelDetail.class);
-        } else {
-            throw new OllamaBaseException(statusCode + " - " + responseBody);
-        }
-    }
-
-    /**
-     * Create a custom model from a model file. Read more about custom model file creation <a
-     * href="https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md">here</a>.
-     *
-     * @param modelName     the name of the custom model to be created.
-     * @param modelFilePath the path to model file that exists on the Ollama server.
-     */
-    public void createModelWithFilePath(String modelName, String modelFilePath)
-            throws IOException, InterruptedException, OllamaBaseException, URISyntaxException {
-        String url = this.host + "/api/create";
-        String jsonData = new CustomModelFilePathRequest(modelName, modelFilePath).toString();
-        HttpRequest request =
-                getRequestBuilderDefault(new URI(url))
-                        .header("Accept", "application/json")
-                        .header("Content-Type", "application/json")
-                        .POST(HttpRequest.BodyPublishers.ofString(jsonData, StandardCharsets.UTF_8))
-                        .build();
-        HttpClient client = HttpClient.newHttpClient();
-        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseString = response.body();
-        if (statusCode != 200) {
-            throw new OllamaBaseException(statusCode + " - " + responseString);
-        }
-        // FIXME: Ollama API returns HTTP status code 200 for model creation failure cases. Correct this
-        // if the issue is fixed in the Ollama API server.
-        if (responseString.contains("error")) {
-            throw new OllamaBaseException(responseString);
-        }
-        if (verbose) {
-            logger.info(responseString);
-        }
-    }
-
-    /**
-     * Create a custom model from a model file. Read more about custom model file creation <a
-     * href="https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md">here</a>.
-     *
-     * @param modelName         the name of the custom model to be created.
-     * @param modelFileContents the path to model file that exists on the Ollama server.
-     */
-    public void createModelWithModelFileContents(String modelName, String modelFileContents)
-            throws IOException, InterruptedException, OllamaBaseException, URISyntaxException {
-        String url = this.host + "/api/create";
-        String jsonData = new CustomModelFileContentsRequest(modelName, modelFileContents).toString();
-        HttpRequest request =
-                getRequestBuilderDefault(new URI(url))
-                        .header("Accept", "application/json")
-                        .header("Content-Type", "application/json")
-                        .POST(HttpRequest.BodyPublishers.ofString(jsonData, StandardCharsets.UTF_8))
-                        .build();
-        HttpClient client = HttpClient.newHttpClient();
-        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseString = response.body();
-        if (statusCode != 200) {
-            throw new OllamaBaseException(statusCode + " - " + responseString);
-        }
-        if (responseString.contains("error")) {
-            throw new OllamaBaseException(responseString);
-        }
-        if (verbose) {
-            logger.info(responseString);
-        }
-    }
-
-    /**
-     * Delete a model from Ollama server.
-     *
-     * @param modelName          the name of the model to be deleted.
-     * @param ignoreIfNotPresent ignore errors if the specified model is not present on Ollama server.
-     */
-    public void deleteModel(String modelName, boolean ignoreIfNotPresent)
-            throws IOException, InterruptedException, OllamaBaseException, URISyntaxException {
-        String url = this.host + "/api/delete";
-        String jsonData = new ModelRequest(modelName).toString();
-        HttpRequest request =
-                getRequestBuilderDefault(new URI(url))
-                        .method("DELETE", HttpRequest.BodyPublishers.ofString(jsonData, StandardCharsets.UTF_8))
-                        .header("Accept", "application/json")
-                        .header("Content-type", "application/json")
-                        .build();
-        HttpClient client = HttpClient.newHttpClient();
-        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseBody = response.body();
-        if (statusCode == 404 && responseBody.contains("model") && responseBody.contains("not found")) {
-            return;
-        }
-        if (statusCode != 200) {
-            throw new OllamaBaseException(statusCode + " - " + responseBody);
-        }
-    }
-
-    /**
-     * Generate embeddings for a given text from a model
-     *
-     * @param model  name of model to generate embeddings from
-     * @param prompt text to generate embeddings for
-     * @return embeddings
-     */
-    public List<Double> generateEmbeddings(String model, String prompt)
-            throws IOException, InterruptedException, OllamaBaseException {
-        return generateEmbeddings(new OllamaEmbeddingsRequestModel(model, prompt));
-    }
-
-    /**
-     * Generate embeddings using a {@link OllamaEmbeddingsRequestModel}.
-     *
-     * @param modelRequest request for '/api/embeddings' endpoint
-     * @return embeddings
-     */
-    public List<Double> generateEmbeddings(OllamaEmbeddingsRequestModel modelRequest) throws IOException, InterruptedException, OllamaBaseException {
-        URI uri = URI.create(this.host + "/api/embeddings");
-        String jsonData = modelRequest.toString();
-        HttpClient httpClient = HttpClient.newHttpClient();
-        HttpRequest.Builder requestBuilder =
-                getRequestBuilderDefault(uri)
-                        .header("Accept", "application/json")
-                        .POST(HttpRequest.BodyPublishers.ofString(jsonData));
-        HttpRequest request = requestBuilder.build();
-        HttpResponse<String> response = httpClient.send(request, HttpResponse.BodyHandlers.ofString());
-        int statusCode = response.statusCode();
-        String responseBody = response.body();
-        if (statusCode == 200) {
-            OllamaEmbeddingResponseModel embeddingResponse =
-                    Utils.getObjectMapper().readValue(responseBody, OllamaEmbeddingResponseModel.class);
-            return embeddingResponse.getEmbedding();
-        } else {
-            throw new OllamaBaseException(statusCode + " - " + responseBody);
-        }
-    }
-
-
-    /**
-     * Generate response for a question to a model running on Ollama server. This is a sync/blocking
-     * call.
-     *
-     * @param model         the ollama model to ask the question to
-     * @param prompt        the prompt/question text
-     * @param options       the Options object - <a
-     *                      href="https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values">More
-     *                      details on the options</a>
-     * @param streamHandler optional callback consumer that will be applied every time a streamed response is received. If not set, the stream parameter of the request is set to false.
-     * @return OllamaResult that includes response text and time taken for response
-     */
-    public OllamaResult generate(String model, String prompt, boolean raw, Options options, OllamaStreamHandler streamHandler)
-            throws OllamaBaseException, IOException, InterruptedException {
-        OllamaGenerateRequest ollamaRequestModel = new OllamaGenerateRequest(model, prompt);
-        ollamaRequestModel.setRaw(raw);
-        ollamaRequestModel.setOptions(options.getOptionsMap());
-        return generateSyncForOllamaRequestModel(ollamaRequestModel, streamHandler);
-    }
-
-    /**
-     * Generates response using the specified AI model and prompt (in blocking mode).
-     * <p>
-     * Uses {@link #generate(String, String, boolean, Options, OllamaStreamHandler)}
-     *
-     * @param model   The name or identifier of the AI model to use for generating the response.
-     * @param prompt  The input text or prompt to provide to the AI model.
-     * @param raw     In some cases, you may wish to bypass the templating system and provide a full prompt. In this case, you can use the raw parameter to disable templating. Also note that raw mode will not return a context.
-     * @param options Additional options or configurations to use when generating the response.
-     * @return {@link OllamaResult}
-     */
-    public OllamaResult generate(String model, String prompt, boolean raw, Options options)
-            throws OllamaBaseException, IOException, InterruptedException {
-        return generate(model, prompt, raw, options, null);
-    }
-
-
-    /**
-     * Generates response using the specified AI model and prompt (in blocking mode), and then invokes a set of tools
-     * on the generated response.
-     *
-     * @param model   The name or identifier of the AI model to use for generating the response.
-     * @param prompt  The input text or prompt to provide to the AI model.
-     * @param options Additional options or configurations to use when generating the response.
-     * @return {@link OllamaToolsResult} An OllamaToolsResult object containing the response from the AI model and the results of invoking the tools on that output.
-     * @throws OllamaBaseException  If there is an error related to the Ollama API or service.
-     * @throws IOException          If there is an error related to input/output operations.
-     * @throws InterruptedException If the method is interrupted while waiting for the AI model
-     *                              to generate the response or for the tools to be invoked.
-     */
-    public OllamaToolsResult generateWithTools(String model, String prompt, Options options)
-            throws OllamaBaseException, IOException, InterruptedException, ToolInvocationException {
-        boolean raw = true;
-        OllamaToolsResult toolResult = new OllamaToolsResult();
-        Map<ToolFunctionCallSpec, Object> toolResults = new HashMap<>();
-
-        OllamaResult result = generate(model, prompt, raw, options, null);
-        toolResult.setModelResult(result);
-
-        String toolsResponse = result.getResponse();
-        if (toolsResponse.contains("[TOOL_CALLS]")) {
-            toolsResponse = toolsResponse.replace("[TOOL_CALLS]", "");
-        }
-
-        List<ToolFunctionCallSpec> toolFunctionCallSpecs = Utils.getObjectMapper().readValue(toolsResponse, Utils.getObjectMapper().getTypeFactory().constructCollectionType(List.class, ToolFunctionCallSpec.class));
-        for (ToolFunctionCallSpec toolFunctionCallSpec : toolFunctionCallSpecs) {
-            toolResults.put(toolFunctionCallSpec, invokeTool(toolFunctionCallSpec));
-        }
-        toolResult.setToolResults(toolResults);
-        return toolResult;
-    }
-
-
-    /**
-     * Generate response for a question to a model running on Ollama server and get a callback handle
-     * that can be used to check for status and get the response from the model later. This would be
-     * an async/non-blocking call.
-     *
-     * @param model  the ollama model to ask the question to
-     * @param prompt the prompt/question text
-     * @return the ollama async result callback handle
-     */
-    public OllamaAsyncResultStreamer generateAsync(String model, String prompt, boolean raw) {
-        OllamaGenerateRequest ollamaRequestModel = new OllamaGenerateRequest(model, prompt);
-        ollamaRequestModel.setRaw(raw);
-        URI uri = URI.create(this.host + "/api/generate");
-        OllamaAsyncResultStreamer ollamaAsyncResultStreamer =
-                new OllamaAsyncResultStreamer(
-                        getRequestBuilderDefault(uri), ollamaRequestModel, requestTimeoutSeconds);
-        ollamaAsyncResultStreamer.start();
-        return ollamaAsyncResultStreamer;
-    }
-
-    /**
-     * With one or more image files, ask a question to a model running on Ollama server. This is a
-     * sync/blocking call.
-     *
-     * @param model         the ollama model to ask the question to
-     * @param prompt        the prompt/question text
-     * @param imageFiles    the list of image files to use for the question
-     * @param options       the Options object - <a
-     *                      href="https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values">More
-     *                      details on the options</a>
-     * @param streamHandler optional callback consumer that will be applied every time a streamed response is received. If not set, the stream parameter of the request is set to false.
-     * @return OllamaResult that includes response text and time taken for response
-     */
-    public OllamaResult generateWithImageFiles(
-            String model, String prompt, List<File> imageFiles, Options options, OllamaStreamHandler streamHandler)
-            throws OllamaBaseException, IOException, InterruptedException {
-        List<String> images = new ArrayList<>();
-        for (File imageFile : imageFiles) {
-            images.add(encodeFileToBase64(imageFile));
-        }
-        OllamaGenerateRequest ollamaRequestModel = new OllamaGenerateRequest(model, prompt, images);
-        ollamaRequestModel.setOptions(options.getOptionsMap());
-        return generateSyncForOllamaRequestModel(ollamaRequestModel, streamHandler);
-    }
-
-    /**
-     * Convenience method to call Ollama API without streaming responses.
-     * <p>
-     * Uses {@link #generateWithImageFiles(String, String, List, Options, OllamaStreamHandler)}
-     */
-    public OllamaResult generateWithImageFiles(
-            String model, String prompt, List<File> imageFiles, Options options)
-            throws OllamaBaseException, IOException, InterruptedException {
-        return generateWithImageFiles(model, prompt, imageFiles, options, null);
-    }
-
-    /**
-     * With one or more image URLs, ask a question to a model running on Ollama server. This is a
-     * sync/blocking call.
-     *
-     * @param model         the ollama model to ask the question to
-     * @param prompt        the prompt/question text
-     * @param imageURLs     the list of image URLs to use for the question
-     * @param options       the Options object - <a
-     *                      href="https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values">More
-     *                      details on the options</a>
-     * @param streamHandler optional callback consumer that will be applied every time a streamed response is received. If not set, the stream parameter of the request is set to false.
-     * @return OllamaResult that includes response text and time taken for response
-     */
-    public OllamaResult generateWithImageURLs(
-            String model, String prompt, List<String> imageURLs, Options options, OllamaStreamHandler streamHandler)
-            throws OllamaBaseException, IOException, InterruptedException, URISyntaxException {
-        List<String> images = new ArrayList<>();
-        for (String imageURL : imageURLs) {
-            images.add(encodeByteArrayToBase64(Utils.loadImageBytesFromUrl(imageURL)));
-        }
-        OllamaGenerateRequest ollamaRequestModel = new OllamaGenerateRequest(model, prompt, images);
-        ollamaRequestModel.setOptions(options.getOptionsMap());
-        return generateSyncForOllamaRequestModel(ollamaRequestModel, streamHandler);
-    }
-
-    /**
-     * Convenience method to call Ollama API without streaming responses.
-     * <p>
-     * Uses {@link #generateWithImageURLs(String, String, List, Options, OllamaStreamHandler)}
-     */
-    public OllamaResult generateWithImageURLs(String model, String prompt, List<String> imageURLs,
-                                              Options options)
-            throws OllamaBaseException, IOException, InterruptedException, URISyntaxException {
-        return generateWithImageURLs(model, prompt, imageURLs, options, null);
-    }
-
-
-    /**
-     * Ask a question to a model based on a given message stack (i.e. a chat history). Creates a synchronous call to the api
-     * 'api/chat'.
-     *
-     * @param model    the ollama model to ask the question to
-     * @param messages chat history / message stack to send to the model
-     * @return {@link OllamaChatResult} containing the api response and the message history including the newly aqcuired assistant response.
-     * @throws OllamaBaseException  any response code than 200 has been returned
-     * @throws IOException          in case the responseStream can not be read
-     * @throws InterruptedException in case the server is not reachable or network issues happen
-     */
-    public OllamaChatResult chat(String model, List<OllamaChatMessage> messages) throws OllamaBaseException, IOException, InterruptedException {
-        OllamaChatRequestBuilder builder = OllamaChatRequestBuilder.getInstance(model);
-        return chat(builder.withMessages(messages).build());
-    }
-
-    /**
-     * Ask a question to a model using an {@link OllamaChatRequest}. This can be constructed using an {@link OllamaChatRequestBuilder}.
-     * <p>
-     * Hint: the OllamaChatRequestModel#getStream() property is not implemented.
-     *
-     * @param request request object to be sent to the server
-     * @return {@link OllamaChatResult}
-     * @throws OllamaBaseException  any response code than 200 has been returned
-     * @throws IOException          in case the responseStream can not be read
-     * @throws InterruptedException in case the server is not reachable or network issues happen
-     */
-    public OllamaChatResult chat(OllamaChatRequest request) throws OllamaBaseException, IOException, InterruptedException {
-        return chat(request, null);
-    }
-
-    /**
-     * Ask a question to a model using an {@link OllamaChatRequest}. This can be constructed using an {@link OllamaChatRequestBuilder}.
-     * <p>
-     * Hint: the OllamaChatRequestModel#getStream() property is not implemented.
-     *
-     * @param request       request object to be sent to the server
-     * @param streamHandler callback handler to handle the last message from stream (caution: all previous messages from stream will be concatenated)
-     * @return {@link OllamaChatResult}
-     * @throws OllamaBaseException  any response code than 200 has been returned
-     * @throws IOException          in case the responseStream can not be read
-     * @throws InterruptedException in case the server is not reachable or network issues happen
-     */
-    public OllamaChatResult chat(OllamaChatRequest request, OllamaStreamHandler streamHandler) throws OllamaBaseException, IOException, InterruptedException {
-        OllamaChatEndpointCaller requestCaller = new OllamaChatEndpointCaller(host, basicAuth, requestTimeoutSeconds, verbose);
-        OllamaResult result;
-        if (streamHandler != null) {
-            request.setStream(true);
-            result = requestCaller.call(request, streamHandler);
-        } else {
-            result = requestCaller.callSync(request);
-        }
-        return new OllamaChatResult(result.getResponse(), result.getResponseTime(), result.getHttpStatusCode(), request.getMessages());
-    }
-
-    public void registerTool(Tools.ToolSpecification toolSpecification) {
-        toolRegistry.addFunction(toolSpecification.getFunctionName(), toolSpecification.getToolDefinition());
-    }
-
-    // technical private methods //
-
-    private static String encodeFileToBase64(File file) throws IOException {
-        return Base64.getEncoder().encodeToString(Files.readAllBytes(file.toPath()));
-    }
-
-    private static String encodeByteArrayToBase64(byte[] bytes) {
-        return Base64.getEncoder().encodeToString(bytes);
-    }
-
-    private OllamaResult generateSyncForOllamaRequestModel(
-            OllamaGenerateRequest ollamaRequestModel, OllamaStreamHandler streamHandler)
-            throws OllamaBaseException, IOException, InterruptedException {
-        OllamaGenerateEndpointCaller requestCaller =
-                new OllamaGenerateEndpointCaller(host, basicAuth, requestTimeoutSeconds, verbose);
-        OllamaResult result;
-        if (streamHandler != null) {
-            ollamaRequestModel.setStream(true);
-            result = requestCaller.call(ollamaRequestModel, streamHandler);
-        } else {
-            result = requestCaller.callSync(ollamaRequestModel);
-        }
-        return result;
-    }
-
-    /**
-     * Get default request builder.
-     *
-     * @param uri URI to get a HttpRequest.Builder
-     * @return HttpRequest.Builder
-     */
-    private HttpRequest.Builder getRequestBuilderDefault(URI uri) {
-        HttpRequest.Builder requestBuilder =
-                HttpRequest.newBuilder(uri)
-                        .header("Content-Type", "application/json")
-                        .timeout(Duration.ofSeconds(requestTimeoutSeconds));
-        if (isBasicAuthCredentialsSet()) {
-            requestBuilder.header("Authorization", getBasicAuthHeaderValue());
-        }
-        return requestBuilder;
-    }
-
-    /**
-     * Get basic authentication header value.
-     *
-     * @return basic authentication header value (encoded credentials)
-     */
-    private String getBasicAuthHeaderValue() {
-        String credentialsToEncode = basicAuth.getUsername() + ":" + basicAuth.getPassword();
-        return "Basic " + Base64.getEncoder().encodeToString(credentialsToEncode.getBytes());
-    }
-
-    /**
-     * Check if Basic Auth credentials set.
-     *
-     * @return true when Basic Auth credentials set
-     */
-    private boolean isBasicAuthCredentialsSet() {
-        return basicAuth != null;
-    }
-
-
-    private Object invokeTool(ToolFunctionCallSpec toolFunctionCallSpec) throws ToolInvocationException {
-        try {
-            String methodName = toolFunctionCallSpec.getName();
-            Map<String, Object> arguments = toolFunctionCallSpec.getArguments();
-            ToolFunction function = toolRegistry.getFunction(methodName);
-            if (verbose) {
-                logger.debug("Invoking function {} with arguments {}", methodName, arguments);
-            }
-            if (function == null) {
-                throw new ToolNotFoundException("No such tool: " + methodName);
-            }
-            return function.apply(arguments);
-        } catch (Exception e) {
-            throw new ToolInvocationException("Failed to invoke tool: " + toolFunctionCallSpec.getName(), e);
-        }
-    }
-}
--- a/src/main/java/io/github/ollama4j/agent/Agent.java
+++ b/src/main/java/io/github/ollama4j/agent/Agent.java
@@ -0,0 +1,318 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.agent;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import io.github.ollama4j.Ollama;
+import io.github.ollama4j.exceptions.OllamaException;
+import io.github.ollama4j.impl.ConsoleOutputGenerateTokenHandler;
+import io.github.ollama4j.models.chat.*;
+import io.github.ollama4j.tools.ToolFunction;
+import io.github.ollama4j.tools.Tools;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Scanner;
+import lombok.*;
+
+/**
+ * The {@code Agent} class represents an AI assistant capable of interacting with the Ollama API
+ * server.
+ *
+ * <p>It supports the use of tools (interchangeable code components), persistent chat history, and
+ * interactive as well as pre-scripted chat sessions.
+ *
+ * <h2>Usage</h2>
+ *
+ * <ul>
+ *   <li>Instantiate an Agent via {@link #load(String)} for YAML-based configuration.
+ *   <li>Handle conversation turns via {@link #interact(String, OllamaChatStreamObserver)}.
+ *   <li>Use {@link #runInteractive()} for an interactive console-based session.
+ * </ul>
+ */
+public class Agent {
+    /**
+     * The agent's display name
+     */
+    private final String name;
+
+    /**
+     * List of supported tools for this agent
+     */
+    private final List<Tools.Tool> tools;
+
+    /**
+     * Ollama client instance for communication with the API
+     */
+    private final Ollama ollamaClient;
+
+    /**
+     * The model name used for chat completions
+     */
+    private final String model;
+
+    /**
+     * In-memory chat message history, retained across conversation turns
+     */
+    private final List<OllamaChatMessage> chatHistory;
+
+    /**
+     * Optional custom system prompt for the agent
+     */
+    private final String customPrompt;
+
+    /**
+     * Constructs a new Agent.
+     *
+     * @param name         The agent's given name.
+     * @param ollamaClient The Ollama API client instance to use.
+     * @param model        The model name to use for chat completion.
+     * @param customPrompt A custom prompt to prepend to all conversations (may be null).
+     * @param tools        List of available tools for function calling.
+     */
+    public Agent(
+            String name,
+            Ollama ollamaClient,
+            String model,
+            String customPrompt,
+            List<Tools.Tool> tools) {
+        this.name = name;
+        this.ollamaClient = ollamaClient;
+        this.chatHistory = new ArrayList<>();
+        this.tools = tools;
+        this.model = model;
+        this.customPrompt = customPrompt;
+    }
+
+    /**
+     * Loads and constructs an Agent from a YAML configuration file (classpath or filesystem).
+     *
+     * <p>The YAML should define the agent, the model, and the desired tool functions (using their
+     * fully qualified class names for auto-discovery).
+     *
+     * @param yamlPathOrResource Path or classpath resource name of the YAML file.
+     * @return New Agent instance loaded according to the YAML definition.
+     * @throws RuntimeException if the YAML cannot be read or agent cannot be constructed.
+     */
+    public static Agent load(String yamlPathOrResource) {
+        try {
+            ObjectMapper mapper = new ObjectMapper(new YAMLFactory());
+
+            InputStream input =
+                    Agent.class.getClassLoader().getResourceAsStream(yamlPathOrResource);
+            if (input == null) {
+                java.nio.file.Path filePath = java.nio.file.Paths.get(yamlPathOrResource);
+                if (java.nio.file.Files.exists(filePath)) {
+                    input = java.nio.file.Files.newInputStream(filePath);
+                } else {
+                    throw new RuntimeException(
+                            yamlPathOrResource + " not found in classpath or file system");
+                }
+            }
+            AgentSpec agentSpec = mapper.readValue(input, AgentSpec.class);
+            List<AgentToolSpec> tools = agentSpec.getTools();
+            for (AgentToolSpec tool : tools) {
+                String fqcn = tool.getToolFunctionFQCN();
+                if (fqcn != null && !fqcn.isEmpty()) {
+                    try {
+                        Class<?> clazz = Class.forName(fqcn);
+                        Object instance = clazz.getDeclaredConstructor().newInstance();
+                        if (instance instanceof ToolFunction) {
+                            tool.setToolFunctionInstance((ToolFunction) instance);
+                        } else {
+                            throw new RuntimeException(
+                                    "Class does not implement ToolFunction: " + fqcn);
+                        }
+                    } catch (Exception e) {
+                        throw new RuntimeException(
+                                "Failed to instantiate tool function: " + fqcn, e);
+                    }
+                }
+            }
+            List<Tools.Tool> agentTools = new ArrayList<>();
+            for (AgentToolSpec a : tools) {
+                Tools.Tool t = new Tools.Tool();
+                t.setToolFunction(a.getToolFunctionInstance());
+                Tools.ToolSpec ts = new Tools.ToolSpec();
+                ts.setName(a.getName());
+                ts.setDescription(a.getDescription());
+                ts.setParameters(a.getParameters());
+                t.setToolSpec(ts);
+                agentTools.add(t);
+            }
+            Ollama ollama = new Ollama(agentSpec.getHost());
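+            // Honor the YAML-configured timeout; fall back to 120 seconds when it is unset (0).
+            int timeoutSeconds = agentSpec.getRequestTimeoutSeconds();
+            ollama.setRequestTimeoutSeconds(timeoutSeconds > 0 ? timeoutSeconds : 120);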
+            return new Agent(
+                    agentSpec.getName(),
+                    ollama,
+                    agentSpec.getModel(),
+                    agentSpec.getCustomPrompt(),
+                    agentTools);
+        } catch (Exception e) {
+            throw new RuntimeException("Failed to load agent from YAML", e);
+        }
+    }
+
+    /**
+     * Runs a single round of chat for the agent:
+     *
+     * <ul>
+     *   <li>Adds the system prompt on the first turn, when the chat history is still empty
+     *   <li>Adds the user's input to the chat history
+     *   <li>Submits the chat turn to the Ollama model (with tool/function support)
+     *   <li>Replaces the internal chat history with the history returned in the chat result
+     * </ul>
+     *
+     * @param userInput The user's message or question for the agent.
+     * @param chatTokenHandler Observer whose handlers receive streamed thinking and response
+     *     tokens.
+     * @return The model's response as a string.
+     * @throws OllamaException If there is a problem with the Ollama API.
+     */
+    public String interact(String userInput, OllamaChatStreamObserver chatTokenHandler)
+            throws OllamaException {
+        // Build a concise and readable description of available tools
+        String availableToolsDescription =
+                tools.isEmpty()
+                        ? ""
+                        : tools.stream()
+                                .map(
+                                        t ->
+                                                String.format(
+                                                        "- %s: %s",
+                                                        t.getToolSpec().getName(),
+                                                        t.getToolSpec().getDescription() != null
+                                                                ? t.getToolSpec().getDescription()
+                                                                : "No description"))
+                                .reduce((a, b) -> a + "\n" + b)
+                                .map(desc -> "\nYou have access to the following tools:\n" + desc)
+                                .orElse("");
+
+        // Add system prompt if chatHistory is empty
+        if (chatHistory.isEmpty()) {
+            String systemPrompt =
+                    String.format(
+                            "You are a helpful AI assistant named %s. Your actions are limited to"
+                                    + " using the available tools. %s%s",
+                            name,
+                            (customPrompt != null ? customPrompt : ""),
+                            availableToolsDescription);
+            chatHistory.add(new OllamaChatMessage(OllamaChatMessageRole.SYSTEM, systemPrompt));
+        }
+
+        // Add the user input as a message before sending request
+        chatHistory.add(new OllamaChatMessage(OllamaChatMessageRole.USER, userInput));
+
+        OllamaChatRequest request =
+                OllamaChatRequest.builder()
+                        .withTools(tools)
+                        .withUseTools(true)
+                        .withModel(model)
+                        .withMessages(chatHistory)
+                        .build();
+        OllamaChatResult response = ollamaClient.chat(request, chatTokenHandler);
+
+        // Update chat history for continuity
+        chatHistory.clear();
+        chatHistory.addAll(response.getChatHistory());
+
+        return response.getResponseModel().getMessage().getResponse();
+    }
+
+    /**
+     * Launches an interactive console session with the agent, reading user input from standard
+     * input and streaming the agent's response to standard output.
+     *
+     * <p>Type {@code exit} to break the loop and terminate the session.
+     *
+     * @throws OllamaException if any errors occur talking to the Ollama API.
+     */
+    public void runInteractive() throws OllamaException {
+        Scanner sc = new Scanner(System.in);
+        while (true) {
+            System.out.print("\n[You]: ");
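+            // Stop cleanly on end-of-input instead of letting nextLine() throw
+            if (!sc.hasNextLine()) break;
+            String input = sc.nextLine();
+            if ("exit".equalsIgnoreCase(input.trim())) break;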
+            this.interact(
+                    input,
+                    new OllamaChatStreamObserver(
+                            new ConsoleOutputGenerateTokenHandler(),
+                            new ConsoleOutputGenerateTokenHandler()));
+        }
+    }
+
+    /**
+     * Bean describing an agent as definable from YAML.
+     *
+     * <ul>
+     *   <li>{@code name}: Agent display name
+     *   <li>{@code description}: Freeform description
+     *   <li>{@code tools}: List of tools/functions to enable
+     *   <li>{@code host}: Target Ollama host address
+     *   <li>{@code model}: Name of Ollama model to use
+     *   <li>{@code customPrompt}: Agent's custom base prompt
+     *   <li>{@code requestTimeoutSeconds}: Timeout for requests
+     * </ul>
+     */
+    @Data
+    public static class AgentSpec {
+        private String name;
+        private String description;
+        private List<AgentToolSpec> tools;
+        private String host;
+        private String model;
+        private String customPrompt;
+        private int requestTimeoutSeconds;
+    }
+
+    /**
+     * Extension of {@link Tools.ToolSpec} that associates a tool with a function implementation
+     * (via FQCN).
+     */
+    @Data
+    @EqualsAndHashCode(callSuper = false)
+    private static class AgentToolSpec extends Tools.ToolSpec {
+        /**
+         * Fully qualified class name of the tool's {@link ToolFunction} implementation
+         */
+        private String toolFunctionFQCN = null;
+
+        /**
+         * Instance of the {@link ToolFunction} to invoke
+         */
+        private ToolFunction toolFunctionInstance = null;
+    }
+
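+    /**
+     * Bean for describing a tool function parameter for use in agent YAML definitions. Declared
+     * static so that it can be instantiated without an enclosing {@code Agent} instance.
+     */
+    @Data
+    public static class AgentToolParameter {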
+        /**
+         * The parameter's type (e.g., string, number, etc.)
+         */
+        private String type;
+
+        /**
+         * Description of the parameter
+         */
+        private String description;
+
+        /**
+         * Whether this parameter is required
+         */
+        private boolean required;
+
+        /**
+         * Enum values (if any) that this parameter may take. Named {@code _enum} because
+         * {@code enum} is a reserved keyword in Java.
+         */
+        private List<String> _enum;
+    }
+}
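The first-turn system prompt built in `interact` can be sketched in isolation. The following is a minimal, stand-alone illustration, not the library's code: `SystemPromptSketch` and `ToolInfo` are hypothetical names standing in for `Agent` and `Tools.ToolSpec`, and `Collectors.joining` is used in place of the `reduce` call in the diff, which should produce the same text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical stand-alone sketch; ToolInfo stands in for Tools.ToolSpec. */
class SystemPromptSketch {

    /** Minimal stand-in for a tool's name and (possibly null) description. */
    static final class ToolInfo {
        final String name;
        final String description;

        ToolInfo(String name, String description) {
            this.name = name;
            this.description = description;
        }
    }

    /** Renders one "- name: description" line per tool, or "" when there are no tools. */
    static String describeTools(List<ToolInfo> tools) {
        if (tools.isEmpty()) {
            return "";
        }
        String lines =
                tools.stream()
                        .map(
                                t ->
                                        String.format(
                                                "- %s: %s",
                                                t.name,
                                                t.description != null
                                                        ? t.description
                                                        : "No description"))
                        .collect(Collectors.joining("\n"));
        return "\nYou have access to the following tools:\n" + lines;
    }

    /** Mirrors the system prompt format string used on the first turn. */
    static String buildSystemPrompt(String agentName, String customPrompt, List<ToolInfo> tools) {
        return String.format(
                "You are a helpful AI assistant named %s. Your actions are limited to"
                        + " using the available tools. %s%s",
                agentName,
                customPrompt != null ? customPrompt : "",
                describeTools(tools));
    }

    public static void main(String[] args) {
        List<ToolInfo> tools = new ArrayList<>();
        tools.add(new ToolInfo("weather", "Looks up the current weather"));
        tools.add(new ToolInfo("clock", null)); // null description falls back to a default
        System.out.println(buildSystemPrompt("Helper", "Be brief.", tools));
    }
}
```

Keeping the prompt assembly in a pure function like this makes it easy to check the exact text sent to the model without a running Ollama server.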
--- a/src/main/java/io/github/ollama4j/exceptions/OllamaBaseException.java
+++ b/src/main/java/io/github/ollama4j/exceptions/OllamaBaseException.java
@@ -1,8 +0,0 @@
-package io.github.ollama4j.exceptions;
-
-public class OllamaBaseException extends Exception {
-
-    public OllamaBaseException(String s) {
-        super(s);
-    }
-}
--- a/src/main/java/io/github/ollama4j/exceptions/OllamaException.java
+++ b/src/main/java/io/github/ollama4j/exceptions/OllamaException.java
@@ -0,0 +1,20 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.exceptions;
+
+public class OllamaException extends Exception {
+
+    public OllamaException(String message) {
+        super(message);
+    }
+
+    public OllamaException(String message, Exception exception) {
+        super(message, exception);
+    }
+}
--- a/src/main/java/io/github/ollama4j/exceptions/RoleNotFoundException.java
+++ b/src/main/java/io/github/ollama4j/exceptions/RoleNotFoundException.java
@@ -0,0 +1,16 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.exceptions;
+
+public class RoleNotFoundException extends Exception {
+
+    public RoleNotFoundException(String s) {
+        super(s);
+    }
+}
--- a/src/main/java/io/github/ollama4j/exceptions/ToolInvocationException.java
+++ b/src/main/java/io/github/ollama4j/exceptions/ToolInvocationException.java
@@ -1,7 +1,19 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
 package io.github.ollama4j.exceptions;

 public class ToolInvocationException extends Exception {

+    public ToolInvocationException(String s) {
+        super(s);
+    }
+
    public ToolInvocationException(String s, Exception e) {
        super(s, e);
    }
--- a/src/main/java/io/github/ollama4j/exceptions/ToolNotFoundException.java
+++ b/src/main/java/io/github/ollama4j/exceptions/ToolNotFoundException.java
@@ -1,3 +1,11 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
 package io.github.ollama4j.exceptions;

 public class ToolNotFoundException extends Exception {
--- a/src/main/java/io/github/ollama4j/impl/ConsoleOutputChatTokenHandler.java
+++ b/src/main/java/io/github/ollama4j/impl/ConsoleOutputChatTokenHandler.java
@@ -0,0 +1,18 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.impl;
+
+import io.github.ollama4j.models.chat.OllamaChatStreamObserver;
+
+public final class ConsoleOutputChatTokenHandler extends OllamaChatStreamObserver {
+    public ConsoleOutputChatTokenHandler() {
+        setThinkingStreamHandler(new ConsoleOutputGenerateTokenHandler());
+        setResponseStreamHandler(new ConsoleOutputGenerateTokenHandler());
+    }
+}
--- a/src/main/java/io/github/ollama4j/impl/ConsoleOutputGenerateTokenHandler.java
+++ b/src/main/java/io/github/ollama4j/impl/ConsoleOutputGenerateTokenHandler.java
@@ -0,0 +1,18 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.impl;
+
+import io.github.ollama4j.models.generate.OllamaGenerateTokenHandler;
+
+public class ConsoleOutputGenerateTokenHandler implements OllamaGenerateTokenHandler {
+    @Override
+    public void accept(String message) {
+        System.out.print(message);
+    }
+}
--- a/src/main/java/io/github/ollama4j/impl/ConsoleOutputStreamHandler.java
+++ b/src/main/java/io/github/ollama4j/impl/ConsoleOutputStreamHandler.java
@@ -1,14 +0,0 @@
-package io.github.ollama4j.impl;
-
-import io.github.ollama4j.models.generate.OllamaStreamHandler;
-
-public class ConsoleOutputStreamHandler implements OllamaStreamHandler {
-    private final StringBuffer response = new StringBuffer();
-
-    @Override
-    public void accept(String message) {
-        String substr = message.substring(response.length());
-        response.append(substr);
-        System.out.print(substr);
-    }
-}
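The removed `ConsoleOutputStreamHandler` and the new `ConsoleOutputGenerateTokenHandler` differ in what they expect each callback to carry. Judging only from the two method bodies in this diff, the old handler was written for cumulative text and printed the new suffix, while the new one prints each message as-is, which suggests per-token delivery. The sketch below contrasts the two styles; the class names are hypothetical and `java.util.function.Consumer` stands in for the library's handler interfaces.

```java
import java.util.function.Consumer;

/** Hypothetical sketch contrasting cumulative and per-token stream handlers. */
class StreamHandlerSketch {

    /** Old style: each call receives the full text so far; only the new suffix is emitted. */
    static final class CumulativeHandler implements Consumer<String> {
        private final StringBuilder seen = new StringBuilder();
        private final StringBuilder out;

        CumulativeHandler(StringBuilder out) {
            this.out = out;
        }

        @Override
        public void accept(String messageSoFar) {
            // Everything past what was already seen is new output
            String delta = messageSoFar.substring(seen.length());
            seen.append(delta);
            out.append(delta);
        }
    }

    /** New style: each call receives one token, which is emitted unchanged. */
    static final class TokenHandler implements Consumer<String> {
        private final StringBuilder out;

        TokenHandler(StringBuilder out) {
            this.out = out;
        }

        @Override
        public void accept(String token) {
            out.append(token);
        }
    }

    public static void main(String[] args) {
        StringBuilder a = new StringBuilder();
        Consumer<String> cumulative = new CumulativeHandler(a);
        for (String s : new String[] {"He", "Hello", "Hello!"}) {
            cumulative.accept(s);
        }

        StringBuilder b = new StringBuilder();
        Consumer<String> perToken = new TokenHandler(b);
        for (String s : new String[] {"He", "llo", "!"}) {
            perToken.accept(s);
        }

        System.out.println(a + " / " + b); // prints "Hello! / Hello!"
    }
}
```

Callers migrating a custom handler would need to check which of the two conventions their code assumes, since feeding per-token input to a cumulative-style handler would drop or garble text.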
--- a/src/main/java/io/github/ollama4j/metrics/MetricsRecorder.java
+++ b/src/main/java/io/github/ollama4j/metrics/MetricsRecorder.java
@@ -0,0 +1,129 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
+package io.github.ollama4j.metrics;
+
+import com.google.common.base.Throwables;
+import io.prometheus.client.Counter;
+import io.prometheus.client.Histogram;
+import java.util.Map;
+
+public class MetricsRecorder {
+
+    // Label names must match, in order and count, the values passed to labels() in record()
+    private static final Counter requests =
+            Counter.build()
+                    .name("ollama_api_requests_total")
+                    .help("Total requests to Ollama API")
+                    .labelNames(
+                            "endpoint",
+                            "model",
+                            "raw",
+                            "streaming",
+                            "thinking",
+                            "http_status",
+                            "options",
+                            "format")
+                    .register();
+
+    private static final Histogram requestLatency =
+            Histogram.build()
+                    .name("ollama_api_request_duration_seconds")
+                    .help("Request latency in seconds")
+                    .labelNames(
+                            "endpoint",
+                            "model",
+                            "raw",
+                            "streaming",
+                            "thinking",
+                            "http_status",
+                            "options",
+                            "format")
+                    .register();
+
+    private static final Histogram responseSize =
+            Histogram.build()
+                    .name("ollama_api_response_size_bytes")
+                    .help("Response size in bytes")
+                    .labelNames("endpoint", "model", "options")
+                    .register();
+
+    public static void record(
+            String endpoint,
+            String model,
+            boolean raw,
+            boolean thinking,
+            boolean streaming,
+            Map<String, Object> options,
+            Object format,
+            long startTime,
+            int responseHttpStatus,
+            Object response) {
+        long endTime = System.currentTimeMillis();
+
+        String httpStatus = String.valueOf(responseHttpStatus);
+
+        String formatString = "";
+        if (format instanceof String) {
+            formatString = (String) format;
+        } else if (format instanceof Map) {
+            formatString = mapToString((Map<String, Object>) format);
+        } else if (format != null) {
+            formatString = format.toString();
+        }
+
+        // Ensure the number of labels matches the labelNames above (8 labels)
+        requests.labels(
+                        endpoint,
+                        safe(model),
+                        String.valueOf(raw),
+                        String.valueOf(streaming),
+                        String.valueOf(thinking),
+                        httpStatus,
+                        safe(mapToString(options)),
+                        safe(formatString))
+                .inc();
+        double durationSeconds = (endTime - startTime) / 1000.0;
+
+        // Ensure the number of labels matches the labelNames above (8 labels)
+        requestLatency
+                .labels(
+                        endpoint,
+                        safe(model),
+                        String.valueOf(raw),
+                        String.valueOf(streaming),
+                        String.valueOf(thinking),
+                        httpStatus,
+                        safe(mapToString(options)),
+                        safe(formatString))
+                .observe(durationSeconds);
+
+        // Record response size in bytes (UTF-8 length of the response's string form)
+        if (response != null) {
+            if (response instanceof Exception) {
+                response = Throwables.getStackTraceAsString((Throwable) response);
+            }
+            int size =
+                    response.toString().getBytes(java.nio.charset.StandardCharsets.UTF_8).length;
+            responseSize.labels(endpoint, safe(model), safe(mapToString(options))).observe(size);
+        }
+    }
+
+    // Utility method to convert options Map to string (you can adjust this for more detailed
+    // representation)
+    private static String mapToString(Map<String, Object> map) {
+        if (map == null || map.isEmpty()) {
+            return "none";
+        }
+        // Convert the map to a string (can be customized to fit the use case)
+        return map.toString();
+    }
+
+    private static String safe(String value) {
+        return (value == null || value.isEmpty()) ? "none" : value;
+    }
+}
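The comments in `record` stress that the number of label values must equal the number of declared label names. A small sketch, free of any Prometheus dependency, shows one way to keep the two in step and to normalize null or empty values the way `safe` and `mapToString` do. `MetricLabelsSketch` and `labelValues` are hypothetical names introduced here for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of building label values that line up with the declared label names. */
class MetricLabelsSketch {

    static final String[] LABEL_NAMES = {
        "endpoint", "model", "raw", "streaming", "thinking", "http_status", "options", "format"
    };

    /** Replaces null or empty values with "none", as the diff's safe() does. */
    static String safe(String value) {
        return (value == null || value.isEmpty()) ? "none" : value;
    }

    /** Renders an options map as a string, or "none" when it is null or empty. */
    static String mapToString(Map<String, Object> map) {
        return (map == null || map.isEmpty()) ? "none" : map.toString();
    }

    /** Builds the values in the same order as LABEL_NAMES and checks that the counts agree. */
    static String[] labelValues(
            String endpoint,
            String model,
            boolean raw,
            boolean streaming,
            boolean thinking,
            int httpStatus,
            Map<String, Object> options,
            String format) {
        String[] values = {
            endpoint,
            safe(model),
            String.valueOf(raw),
            String.valueOf(streaming),
            String.valueOf(thinking),
            String.valueOf(httpStatus),
            safe(mapToString(options)),
            safe(format)
        };
        if (values.length != LABEL_NAMES.length) {
            throw new IllegalStateException("label value count does not match label names");
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("temperature", 0.7);
        String[] values =
                labelValues("/api/chat", "llama3", false, true, false, 200, options, "json");
        for (int i = 0; i < values.length; i++) {
            System.out.println(LABEL_NAMES[i] + "=" + values[i]);
        }
    }
}
```

Building the array in one place means the two `labels(...)` calls in `record` could share it rather than repeating the eight arguments.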
--- a/src/main/java/io/github/ollama4j/models/chat/OllamaChatMessage.java
+++ b/src/main/java/io/github/ollama4j/models/chat/OllamaChatMessage.java
@@ -1,45 +1,57 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
 package io.github.ollama4j.models.chat;

 import static io.github.ollama4j.utils.Utils.getObjectMapper;

+import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.core.JsonProcessingException;
 import com.fasterxml.jackson.databind.annotation.JsonSerialize;
-
 import io.github.ollama4j.utils.FileToBase64Serializer;
-
 import java.util.List;
-import lombok.AllArgsConstructor;
-import lombok.Data;
-import lombok.NoArgsConstructor;
-import lombok.NonNull;
-import lombok.RequiredArgsConstructor;
+import lombok.*;

 /**
 * Defines a single Message to be used inside a chat request against the ollama /api/chat endpoint.
 *
- * @see <a href="https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion">Generate chat completion</a>
+ * @see <a
+ *     href="https://github.com/ollama/ollama/blob/main/docs/api.md#generate-a-chat-completion">Generate
+ *     chat completion</a>
 */
+@SuppressWarnings("NullableProblems")
@Data
@AllArgsConstructor
@RequiredArgsConstructor
@NoArgsConstructor
+@JsonIgnoreProperties(ignoreUnknown = true)
 public class OllamaChatMessage {

-    @NonNull
-    private OllamaChatMessageRole role;
+    @NonNull private OllamaChatMessageRole role;

+    @JsonProperty("content")
    @NonNull
-    private String content;
+    private String response;
+
+    private String thinking;
+
+    private @JsonProperty("tool_calls") List<OllamaChatToolCalls> toolCalls;

    @JsonSerialize(using = FileToBase64Serializer.class)
    private List<byte[]> images;
-    
-      @Override
-  public String toString() {
-    try {
-      return getObjectMapper().writerWithDefaultPrettyPrinter().writeValueAsString(this);
-    } catch (JsonProcessingException e) {
-      throw new RuntimeException(e);
+
+    @Override
+    public String toString() {
+        try {
+            return getObjectMapper().writerWithDefaultPrettyPrinter().writeValueAsString(this);
+        } catch (JsonProcessingException e) {
+            throw new RuntimeException(e);
+        }
    }
-  }
 }
--- a/src/main/java/io/github/ollama4j/models/chat/OllamaChatMessageRole.java
+++ b/src/main/java/io/github/ollama4j/models/chat/OllamaChatMessageRole.java
@@ -1,19 +1,57 @@
+/*
+ * Ollama4j - Java library for interacting with Ollama server.
+ * Copyright (c) 2025 Amith Koujalgi and contributors.
+ *
+ * Licensed under the MIT License (the "License");
+ * you may not use this file except in compliance with the License.
+ *
+*/
 package io.github.ollama4j.models.chat;

 import com.fasterxml.jackson.annotation.JsonValue;
+import io.github.ollama4j.exceptions.RoleNotFoundException;
+import java.util.ArrayList;
+import java.util.List;
+import lombok.Getter;

 /**
 * Defines the possible Chat Message roles.
 */
-public enum OllamaChatMessageRole {
-    SYSTEM("system"),
-    USER("user"),
-    ASSISTANT("assistant");
+@Getter
+public class OllamaChatMessageRole {
+    private static final List<OllamaChatMessageRole> roles = new ArrayList<>();

-    @JsonValue
-    private String roleName;
+    public static final OllamaChatMessageRole SYSTEM = new OllamaChatMessageRole("system");
+    public static final OllamaChatMessageRole USER = new OllamaChatMessageRole("user");
+    public static final OllamaChatMessageRole ASSISTANT = new OllamaChatMessageRole("assistant");
+    public static final OllamaChatMessageRole TOOL = new OllamaChatMessageRole("tool");

-    private OllamaChatMessageRole(String roleName){
+    @JsonValue private final String roleName;
+
+    private OllamaChatMessageRole(String roleName) {
        this.roleName = roleName;
+        roles.add(this);
+    }
+
+    public static OllamaChatMessageRole newCustomRole(String roleName) {
+        return new OllamaChatMessageRole(roleName);
+    }
+
+    public static List<OllamaChatMessageRole> getRoles() {
+        return new ArrayList<>(roles);
+    }
+
+    public static OllamaChatMessageRole getRole(String roleName) throws RoleNotFoundException {
+        for (OllamaChatMessageRole role : roles) {
+            if (role.roleName.equals(roleName)) {
+                return role;
+            }
+        }
+        throw new RoleNotFoundException("Invalid role name: " + roleName);
+    }
+
+    @Override
+    public String toString() {
+        return roleName;
    }
 }
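Replacing the enum with a class lets callers register roles beyond the built-in ones while still looking them up by name. The sketch below reproduces that registry pattern on its own. `RoleRegistrySketch` and `Role` are hypothetical names, and an unchecked `IllegalArgumentException` is used here in place of the checked `RoleNotFoundException` so the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of an enum-like class with a registry of known instances. */
class RoleRegistrySketch {

    static final class Role {
        // Declared before the constants so it is initialized when their constructors run
        private static final List<Role> ROLES = new ArrayList<>();

        static final Role SYSTEM = new Role("system");
        static final Role USER = new Role("user");

        private final String name;

        private Role(String name) {
            this.name = name;
            ROLES.add(this); // every instance registers itself
        }

        /** Creates and registers a role that is not one of the built-in constants. */
        static Role newCustomRole(String name) {
            return new Role(name);
        }

        /** Returns the registered role with the given name, or throws if there is none. */
        static Role getRole(String name) {
            for (Role role : ROLES) {
                if (role.name.equals(name)) {
                    return role;
                }
            }
            throw new IllegalArgumentException("Invalid role name: " + name);
        }

        @Override
        public String toString() {
            return name;
        }
    }

    public static void main(String[] args) {
        Role critic = Role.newCustomRole("critic");
        System.out.println(Role.getRole("system") == Role.SYSTEM); // prints "true"
        System.out.println(Role.getRole("critic") == critic); // prints "true"
    }
}
```

One consequence of this design is that the registry is a plain `ArrayList`, so registering custom roles from several threads at once would need external synchronization.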