January 8th, 2026
New

Sometimes, things break.
Filesystems get grumpy.
Bootloaders go missing.
And occasionally… you just want full, raw access to your disks without your OS getting in the way.
So we've just launched a brand-new feature in the Webdock Rescue Console:
Reboot your VPS straight into SystemRescue, with a single click.
And yes, this feature exists because one of you asked for it just a couple of days ago. We realized it was easy to implement, genuinely useful, and honestly… a lot of fun to build. So we did.
Open the Rescue Console for your server
At the top of the screen, click "Reboot into Rescue System"
Confirm, and keep the console window open
That's it. Your server will reboot and boot directly into the SystemRescue ISO.
It may take a minute, so please be patient while it loads.
You'll see this message when confirming:
Reboot into Rescue System
You are about to reboot your server into SystemRescue (system-rescue.org).
Simply keep this console window open and you will see your system boot into the Rescue System ISO. This may take a minute, so please be patient. To revert to booting from your normal OS, simply use the restart button in the Webdock Dashboard.
If you issue a reboot command inside the Rescue System, your server will simply boot back into the Rescue System. Networking should come up and be functional in SystemRescue once booted, but if you have connection issues please see the instructions in our documentation.
⚠️ Warning: Making changes to your system, especially disk partitions, is fraught with danger. You should probably create a snapshot of your server before proceeding.
SystemRescue is a powerful, battle-tested Linux rescue environment, giving you full control over your server, even if your OS won't boot.
Here are just some of the things you can do:
Repair broken filesystems (fsck, btrfs-progs, xfs_repair)
Mount disks manually and recover critical data
Resize or inspect partitions
Fix corrupted UUIDs or missing mounts
Reinstall or repair GRUB
Fix broken /etc/fstab
Chroot into your installed OS for deep debugging
Recover from failed kernel upgrades
Reset root or user passwords
Fix broken SSH configurations
Recover access if you locked yourself out
Clone disks or partitions
Perform offline backups
Inspect SMART data
Zero, wipe, or securely erase disks
In short: if your OS can't help you, SystemRescue can.
SystemRescue gives you full, unrestricted access to your disks.
This is incredibly powerful, and potentially destructive.
Always create a snapshot before making changes, especially when:
Editing partitions
Repairing filesystems
Reinstalling bootloaders
Networking should come up automatically once SystemRescue has booted.
If you do experience connectivity issues, please check our docs for assistance:
https://webdock.io/en/docs/webdock-control-panel/getting-started/rescue-console
Use the Restart button in the Webdock Dashboard to boot normally again
⚠️ If you run reboot inside SystemRescue, your server will simply boot back into SystemRescue
SystemRescue gives you immense power, and with that comes risk.
Disk and partition changes are irreversible.
We strongly recommend creating a snapshot before you begin.
As always:
You ask, we build, you break things safely
Happy rescuing,
The Webdock Team
December 10th, 2025

We recently launched our new in-house AI chat system, and while our newsletter announcement focuses on the user-facing experience, this post goes all-in on the technical foundation. If you're curious about how a small team can design a fast, sustainable, and highly reliable RAG-powered AI assistant with full control, this is your deep dive.
Want to check it out? Try it here
Let's start from the top and walk through the entire pipeline.
Our AI system is built as a modular, multi-stage pipeline optimized for speed, determinism, and Webdock-specific accuracy.
The flow looks like this:

This architecture gives us:
predictable latency
extremely high context relevance
minimal hallucination rate
full transparency and debuggability
We're not relying on a monolithic LLM to "figure everything out." Instead, every stage has a clear responsibility and failure domain.
Most RAG pipelines skip this step. We do not.
Users write messy things. They ramble. They include multiple questions in one sentence. They mix languages in the same sentence. They mention emotional context, or reference "that thing earlier" without clarity.
Before we do anything, the user message is fed into a lightweight Mistral-Nemo model running locally. Its job:
Clarify ambiguous phrasings
Remove filler words
Convert casual speech into structured intent
Extract the core problem statement
Produce a version of the question optimized for vector search
Translate the query to English with high accuracy, even if mixed languages are used
This is not rewriting for the LLM; it is rewriting specifically to improve retrieval accuracy while normalizing the query to English.
Typical example:
User:
"Hey, so I rebooted and now MariaDB doesn't start and the VPS seems weird??"
Normalized version:
"MariaDB won't start after VPS reboot"
This improves retrieval precision dramatically, especially across large or similarly-worded documents.
Latency for this step: ~400 ms.
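Under the hood, this is a single round-trip to the local model. Here is a minimal sketch of the idea in Python, assuming Ollama's standard /api/generate endpoint and the mistral-nemo model tag; the rewrite prompt is illustrative, not our production prompt:

import requests

REWRITE_PROMPT = (
    "Rewrite the user message as one concise English problem statement "
    "optimized for semantic search. Drop filler, resolve mixed languages, "
    "and keep only the core technical question.\n\nUser message: {msg}"
)

def normalize_query(msg: str) -> str:
    # One call to the local normalizer model via Ollama's generate API
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={"model": "mistral-nemo",  # assumed model tag
              "prompt": REWRITE_PROMPT.format(msg=msg),
              "stream": False},
        timeout=15,
    )
    resp.raise_for_status()
    return resp.json()["response"].strip()

# normalize_query("Hey, so I rebooted and now MariaDB doesn't start??")
# -> e.g. "MariaDB won't start after VPS reboot"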

We chose bge-m3 after benchmarking several embedding models for:
semantic quality
multilingual capability
robustness to noise
cosine similarity distribution consistency
performance on small hardware
It consistently produced the best retrieval results for Webdock's knowledge domain.
Every piece of Webdock documentation (website pages, KB articles, product pages, FAQ entries, migration guides, pricing descriptions, API docs) is continuously transformed into clean, structured Markdown objects.
Each Markdown block is summarized into:
A short "semantic header"
A long-form chunk
Metadata tags
Canonical source URL
Timestamped summary
This gives us a search index that always matches reality, even when we update the docs.
Knowledge Base → Markdown → Summaries → Embeddings → FAISS Index
----------------------------------------------------------------
[Docs] --> [Markdown] --> [Chunking] --> [Summaries] --> [bge-m3 Embedding]
  |            |              |              |
  |            |              |              +--> [Vector Store]
  |            |              |
  |            |              +--> [Metadata Index]
  |            |
  +------------+--> [Continuous Updater (Cronjob)]

At query time, we:
embed with bge-m3 (1024-dim vectors)
cosine similarity search using an optimized FAISS pipeline
approximate kNN tuned for sub-millisecond distance calculations
Retrieval time: ~300 ms for the entire operation, including:
embedding
vector search
top-k filtering
deduplication
relevance weighting
chunk aggregation
This is extremely fast for a full RAG pipeline.
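The search core of this stage fits in a few lines. A hedged sketch: bge-m3 loaded via sentence-transformers, with an inner-product FAISS index over L2-normalized vectors (equivalent to cosine similarity); the corpus and top-k here are illustrative:

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("BAAI/bge-m3")  # produces 1024-dim embeddings

chunks = [
    "How to restart MariaDB on your VPS",
    "Fixing boot problems with GRUB",
]  # illustrative corpus; ours is the full documentation index

vecs = model.encode(chunks, normalize_embeddings=True)  # unit-length vectors
index = faiss.IndexFlatIP(vecs.shape[1])  # inner product == cosine on unit vectors
index.add(np.asarray(vecs, dtype="float32"))

q = model.encode(["MariaDB won't start after VPS reboot"],
                 normalize_embeddings=True)
scores, ids = index.search(np.asarray(q, dtype="float32"), k=2)
for s, i in zip(scores[0], ids[0]):
    print(f"{s:.3f}  {chunks[i]}")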
Once the relevant chunks are found, we build the payload for the LLM, consisting of:
our engineered system prompt
the normalized query
the top relevant Markdown chunks
the user's last 3 messages
the LLM's most recent answer
We call this our micro-conversation memory.
It avoids long-context bloat while still supporting:
conversations about troubleshooting
multi-turn clarification
follow-up questions
refinement loops
We do not store or log this memory beyond the active session; it is purely local context.
Context assembly time: ~20β40 ms.
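Conceptually, the assembly step is just deterministic string building in a fixed order. A simplified sketch; the section delimiters are assumptions, not our exact layout:

def build_payload(system_prompt: str, query: str, chunks: list[str],
                  history: list[str], last_answer: str) -> str:
    """Assemble the LLM input: prompt, RAG context, micro-conversation memory."""
    return "\n\n".join([
        system_prompt,                                         # ~35% of payload
        "### Documentation context\n" + "\n\n".join(chunks),   # ~45%
        "### Recent user messages\n" + "\n".join(history[-3:]),# ~15%
        "### Previous answer\n" + last_answer,                 # ~5%
        "### Question\n" + query,
    ])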
Context Assembly Payload Composition
System Prompt                  | ████████████████████ 35%
RAG Chunks                     | ███████████████████████████ 45%
User Message History (3 turns) | ████████ 15%
Assistant Last Reply           | ██ 5%

We run two independent Qwen 14B models via Ollama on GPUs. Why two?
improved throughput
better concurrency
more predictable latency
simple load balancing
Each Qwen instance is pinned to:
40 dedicated CPU cores (for tokenization + inference scheduling)
two A100 GPU tiles
With two independent pipelines, even if one instance receives a heavy prompt, the other keeps the system responsive.
After extensive testing against other 3B–70B models, Qwen 14B hit the sweet spot:
excellent reasoning
strong multilingual capability
robust adherence to structured prompts
low hallucination rate
outstanding performance per GPU watt
fits comfortably on 2x A100 16GB VRAM
With our optimized prompt and RAG setup:
First Token Latency: ~3 seconds when warm (an occasional cold start can add ~8 seconds here; we are working on eliminating that)
Streaming Speed: ~35–50 tokens/sec (varies by context size)
This is more than enough for support-grade responsiveness.
Ollama gives us:
stability
predictable model loading
minimal overhead
zero dependency hell
efficient VRAM usage
trivial multi-instance support
It lets us keep everything reproducible and transparent.
We use a simple, elegant round-robin router instead of a stateful queue.
Because the two LLM instances are truly independent, this lets us:
evenly distribute workload
avoid queue pile-ups under sudden load spikes
serve 10–12 simultaneous requests with comfortable latency (and even if we hit those limits, we built a queue system which informs the user that they are next in line to be served :)
scale horizontally by simply adding more model instances
This architecture is trivially scalable.
If we want:
4 Qwen instances?
or 8?
on multiple GPU servers?
…we can do that without rearchitecting the system.
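Because the instances are fully independent, the router needs no shared state beyond a counter. A minimal sketch (ignoring thread-safety and retries for brevity); the host names and the qwen2.5:14b model tag are assumptions:

from itertools import cycle
import requests

# Two independent Ollama backends; scaling out means appending to this list
BACKENDS = cycle([
    "http://gpu-a:11434",
    "http://gpu-b:11434",
])

def generate(prompt: str) -> str:
    backend = next(BACKENDS)  # round-robin: each request hits the next instance
    resp = requests.post(
        f"{backend}/api/generate",
        json={"model": "qwen2.5:14b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]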

Our entire AI system runs on refurbished enterprise hardware:
NVIDIA A100 16GB PCIe cards
older generation, extremely affordable on the refurb market
far from "obsolete" in real-world inference workloads
A100 16GB still excels at:
medium-size LLMs (14Bβ30B)
multi-model pipelines
fast embedding generation
high concurrency inference
Because the models are so efficient, we need only four GPUs to serve our typical load with plenty of headroom.
Refurbished Hardware = Less e-waste
100% Green Electricity (Denmark)
Zero cloud dependence
On-prem inference: no data shipped externally, 100% GDPR compliance
Extremely low operational power draw
This gives us a uniquely eco-friendly and privacy-oriented AI architecture.
Because we control the frontend entirely, we built features that SaaS chat solutions can't offer:

Smooth ChatGPT-style streaming
Animated typing indicator
Session history and reset controls
Suggested follow-up actions generated automatically
Human support handover button inside the chat
instantly switches to real support when needed
UI theme integrated with Webdock brand
Fine-grained analytics without compromising privacy
The entire frontend is loaded via a lightweight iframe overlay, allowing us to embed it anywhere on webdock.io or the control panel.
Our system prompt enforces:
strict product scope
RAG-only factual grounding
competitor exclusion rules
escalation logic
URL constraints
multilingual replies
structured, modular response blocks
safety rules & fallback behaviors
The system prompt is the "constitution" of the AI.
It ensures:
predictable behavior
zero hallucinated services
clear, structured answers
no drift into topics we do not support
relatable and friendly Webdock tone
The prompt was refined through hundreds of test cases, and we continuously improve it by monitoring real user interactions.

After several weeks of testing and now a few days of real traffic:
Query Normalization    | ███████ 400 ms
Embedding + RAG Search | █████ 300 ms
Context Assembly       | █ 35 ms
LLM First Token (warm) | ██████████████████████████████████████████████████ 3000 ms
Tokens/Second          | 35–50 tokens/sec (streaming)

Stable across thousands of user prompts.
GPU utilization: ~20–40% per 2x GPU when processing a typical single prompt, leaving plenty of headroom.
Three reasons:
We cannot rely on a general-purpose AI model "hoping" it knows Webdock's offerings.
We need deterministic grounding.
Our LLM is extremely fast: around 3 seconds from submitting your prompt to the answer streaming back to you is excellent compared to most 3rd-party services.
Running on refurbished hardware is dramatically cheaper than cloud LLM APIs at scale. All we pay for is our 100% green electricity, and our chat AI uses about 500 watts total on average, or about 12 kWh/day. At average Danish electricity costs, we are spending about €1.60 per 24 hours running our stack, which could in theory handle something like 15-20 thousand queries per 24 hours - not that our load is anywhere approaching those numbers :)
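The back-of-the-envelope math, for the curious (the electricity price here is a rough assumption chosen to match the figures above):

avg_watts = 500
kwh_per_day = avg_watts * 24 / 1000                  # 12.0 kWh/day
eur_per_kwh = 0.133                                  # rough Danish average (assumption)
print(f"~{kwh_per_day * eur_per_kwh:.2f} EUR/day")   # ~1.60 EUR/day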
Compared to the typical bill from the 3rd-party provider we used up until this point, which used OpenAI models, we are already saving ~80% on our monthly inference costs, and we have a long way to go before our costs will ever increase (given the high volume we can handle already). We are no longer paying per-token or per-conversation; instead we just have to look at overall throughput per watt and how many GPUs we have available in-house.
This calculation does not take into account depreciation cost for the hardware we sourced, but we were lucky to get our hands on a large-ish stack of Enterprise Dell 4x A100 16GB machines for very cheap, so we are not really worrying about that.
Customer queries never leave our datacenter. 100% GDPR Compliance.
We're just getting started. What we are working on in 2026:
Conversational billing explanations
Proactive troubleshooting suggestions
Embeddings and RAG for internal support tooling
Auto-generated configuration snippets
User-specific contextual help in the dashboard
Multi-agent pipeline for pre-sales + technical assistance + lifecycle management
Our current infrastructure is flexible enough to support years of expansion, and we already have the hardware on hand to build and run most of these upcoming workloads.
Webdockβs new AI assistant is the result of an end-to-end engineering effort involving:
model tuning
careful RAG architecture
GPU optimization
environmental sustainability
frontend development
prompt engineering
concurrency control
and deep integration into our existing documentation workflows
It's fast.
It's accurate.
It's green.
It's ours: built by Webdock, for Webdock customers.
Thank you for reading all the nerdy details! :)
Arni Johannesson, CEO Webdock
November 20th, 2025
Improved

We're excited to unveil version 2.0 of the native Webdock Control Panel app, completely redesigned to give you a faster, smoother, and more powerful way to manage your servers on the go.
Built from the ground up with an all-new user experience, this update brings a modern, intuitive interface along with a range of highly requested features that put full control of your Webdock environment right in your pocket.
Available now for iOS and Android.
Stay informed with push notifications for important server events. Later this year we will update the notification center with system updates and account-related alerts, so you're always in the loop. More on that later…

Need help? Chat directly with our support team from within the app. Get quick answers and real-time assistance whenever you need it.

Spin up new servers in just a few taps. Whether you're launching a project or scaling up, you can now deploy new instances right from your phone.

Get the latest Webdock news, updates, and feature releases delivered straight to your app, so you never miss out on what's new.
Enjoy a sleek new dark mode that's easier on the eyes, perfect for late-night sessions or those who prefer a more subdued interface.

Webdock App v2.0 is more than just a fresh coat of paint: it's a complete overhaul designed to make your VPS server management experience effortless, responsive, and enjoyable.
September 30th, 2025
New

We're excited to introduce a powerful new tool now available in your Webdock dashboard: the Rescue Console.

With Rescue Console, you can connect directly to your VPS, completely bypassing the network stack. This gives you:
Direct console access to your server, just like plugging in a monitor and keyboard.
A reliable fallback when Web SSH isn't responding.
Rescue and recovery options for firewall lockouts, broken configurations, or network issues.
The Rescue Console is designed to give you greater control, confidence, and peace of mind when managing your servers, especially during those critical moments when every second counts. Instead of waiting on support or being blocked by networking issues, you always have a direct line into your VPS.

⚠️ Good to know: You'll still need a working shell user login to access your server. In most cases, adding a new shell user through the Webdock dashboard will let you log in through the Rescue Console if you forgot your password or don't have a shell user set up yet.
With this feature, weβre closing the gap between affordable VPS hosting and enterprise-grade recovery tools β putting more power back in your hands.
We have written a more detailed article in our documentation if you want to read further on how this works.
August 6th, 2025
New

The Webdock No-Nonsense Load Balancer is a powerful, flat-rate solution designed to distribute traffic evenly across multiple servers, ensuring high availability, faster performance, and automatic failover. Priced at just €9.95 per domain/month with no hidden fees, it includes SSL offloading, HTTP/3 support, edge caching, and live traffic stats, all managed through the Webdock dashboard. Hosted in Denmark with full GDPR compliance, it offers sustainable, EU-based infrastructure powered by renewable energy. Whether routing traffic to Webdock VPS or external servers, it's a scalable, secure, and contract-free way to keep your web applications online and performing at their best.
Flat-Fee Pricing
At €9.95 / domain per month, there are no bandwidth or request limits, and no hidden overage charges.
Traffic Distribution & Redundancy
Ensures even distribution of incoming traffic across multiple servers, reducing overload, latency, and downtime. Automatic failover reroutes traffic if one server becomes unavailable.
Performance Enhancements
Offers SSL offloading, HTTP/3 support, edge caching of static and dynamic content, and can reduce server load by up to 25%.
Security & EU Hosting & Sustainability
Hosted in Webdock's Denmark data centre (DK-DC1), with GDPR compliance and multi-layer protection including Voxility DDoS scrubbing. Infrastructure runs on renewable energy.
Integrated & Simple Setup
Easily created and managed via the Webdock control panel. Includes automated Let's Encrypt SSL certs, DNS configuration instructions, and live traffic stats. Fits alongside other Webdock services in a unified dashboard.
Scalability & Flexibility
Designed to adapt to traffic spikes transparently. Can route traffic to both Webdock VPS and external servers with full control over hardware and IPs. No vendor lock-in, no contracts.
Fixed, predictable monthly cost per domain.
Improved performance and uptime with effortless scaling.
Seamless integration with Webdock's existing VPS ecosystem.
European-hosted, privacy-conscious, and energy-efficient infrastructure.
August 6th, 2025
New

Introducing the Webdock No-Nonsense Web Application Firewall: a powerful, EU-hosted security solution designed to protect your websites from bots, scrapers, DDoS, and other malicious attacks. Built on the advanced Blackwall engine, this WAF offers real-time protection without the need for complex setup or manual rule management. With flat-rate pricing, seamless integration into your Webdock dashboard, and GDPR-compliant hosting in Denmark, it's the ideal choice for developers and businesses who want reliable, hassle-free website security. Activate per domain, customize as needed, and enjoy faster load times with less server strain, all without vendor lock-in or hidden costs.
Advanced Protection: Blocks bots, scrapers, SQL injections, XSS, and Layer-7 (L7) DDoS attacks using the BotGuard/Blackwall engine. No manual rule setup required.
Effortless Management: Activate protection per domain via the Webdock dashboard. DNS configuration and SSL certificate handling via Let's Encrypt are included.
Flat-Rate Pricing: €11.95 per domain/month, with no bandwidth limits, overage charges, or hidden fees.
Datacenter in Denmark (DK-DC1): Fully owned and operated by Webdock, ensuring data remains within the EU under strict EU law.
No Vendor Lock-in: Full freedom to use your stack or switch providers at any time, with no contracts.
Performance Optimisation: Mitigates malicious traffic to reduce server load and improve response times.
Unified Dashboard Experience: Manage WAF alongside SSL, caching, load balancers, and VPS within Webdockβs control panel.
Scalable Cloud Architecture: Automatically handles traffic spikes and scales with your domain needs.
Regular Updates: Constant updates of threat signatures and security logic ensure protection against the latest attacks.
Customizable Rules: Tailor firewall behavior per domain for more precise security control.
Effective protection against bots, DDoS, scrapers, and automated attacks.
Low server load, faster page delivery.
Simplified pricing model with predictable monthly costs.
Full EU data control thanks to a Danish data centre.
Integrated with Webdock's ecosystem: easy setup, unified management.
Activate via Dashboard: Add your domain, then follow DNS configuration instructions.
System Setup: WAF filters traffic, issues SSL, and enables edge caching automatically.
Monitor & Customize: Access detailed traffic metrics and tweak rules as needed.
Unified Billing: View charges alongside other Webdock services, all within the same account.
In summary, Webdock's No-Nonsense WAF delivers high-performance, EU-based web application firewall functionality with transparent pricing, minimal maintenance, and tight integration into the Webdock ecosystem.
August 5th, 2025
New

We are happy to announce our latest project: Webdock CLI.
This is a command line tool available for macOS, Windows and Linux which enables you to easily script and interact with the Webdock API. You don't need to know the ins and outs of REST APIs and payloads anymore: simply install the Webdock CLI and use it in your scripts, or play around with the interactive mode for that truly nerdy terminal experience.
Check out the project on our GitHub: https://github.com/webdock-io/webdock-cli
To get started, simply grab an API key from your Webdock dashboard, install the CLI and run "webdock init"
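For contrast, here is roughly what the CLI saves you from writing: a Python sketch against the REST API. Treat the endpoint path and response shape as assumptions and check the API reference at api.webdock.io for the authoritative details:

import requests

API_TOKEN = "your-api-token"  # created in the Webdock dashboard

resp = requests.get(
    "https://api.webdock.io/v1/servers",  # assumed endpoint path, see the API docs
    headers={"Authorization": f"Bearer {API_TOKEN}"},
    timeout=10,
)
resp.raise_for_status()
for server in resp.json():
    print(server)

With the CLI, this boilerplate disappears behind a single command.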
Have fun building cool stuff!
June 23rd, 2025
New

Refer Webdock to a friend and your friend will earn a 20% discount. There are no limits: whatever your friends purchase within 3 months, you immediately receive 20% of all their purchases in that time period. For example, if your friend spends €50 in their first 3 months, you receive €10 in account credit.
It works like this:
You send an invite: Send your friends a referral link that they can use when purchasing a server.
They make a purchase: You receive 20% as Account Credit within 24 hours. You will be notified by email.
You receive your credits: Your earnings can be used to pay your current or future subscriptions.
You receive earnings for 3 months: For a full 3 months you receive a commission for any purchases your friend makes in that period.
Get your unique link here when you are a customer.
Any service purchased at Webdock will count towards your reward credit - but only server purchases will be discounted for your friend.
As soon as your friend makes a purchase for any amount, you immediately receive your reward credit to your account.
May 6th, 2025
New
Improved

We are thrilled to announce a set of resources designed to make migrating to Webdock effortless, alongside some attention to detail within our platform!
You can get the comparison overview here.
Discover our collection of migration instructions to assist you in moving from your current system to Webdock seamlessly.
Access these guides directly through our support documentation and enjoy a hassle-free migration process.
Resolved a textual error in the message displayed during new server creation, ensuring clearer communication with our users.
We are dedicated to continuously improving your experience with Webdock. Thank you for choosing our services!
May 6th, 2025
Improved

Webdock has always had fast provisioning times, but with the rollout of our new infrastructure in Denmark and after all the changes we had made, we were not happy with provisioning taking, in some cases, a full 4-5 minutes. Here we detail how we managed to get provisioning time down to our best-case time of 25 seconds on an Epyc instance.
When we started working on this, the very first thing we noticed was that when we wrote network configuration we were first shutting down the instance, doing some work on the host, then starting it and writing the network config, and lastly doing a final reboot. Shutting down and rebooting instances is an expensive operation, so the first thing we looked at was whether it was possible to apply networking after the very first boot of an instance and not have to restart it twice. With some magic on the host and with cloud-init, we achieved just that.
Once we had optimized that part of the workflow, we shaved a full minute off provisioning time - now we were down to 2-3 minutes on average. Nice!
After our launch of the Denmark DC - and rather recently in fact - we switched our Hypervisor from LXD to Incus. Incus is a truly open source (and much better maintained) fork of LXD. Despite this change, for compatibility reasons we were still using LXD images. After speaking with the guys that built our new hypervisor, Incus, we found out that the LXD images were extremely ill-optimized and included all sorts of junk that wasn't really required. This caused boot times to be slower by at least 10-15 seconds per boot.
Switching to Incus native images, which are much better optimized, saved another 20-30 seconds, getting us down to 1.5-2.5 minutes.
The next optimization we looked at was that we noticed that a lot of times images were being downloaded from our image remote and not properly cached on our hosts. An image download could take something like 30-40 seconds, so making sure images were cached on hosts reduced provisioning time even further - we were now down to 1-2 minute provisioning time.
The next thing we noticed was that whenever a new instance was spun up, it was doing a restart after having completed almost the entire boot process. Why? After conferring with the people who built our hypervisor, Incus, we found out that this was due to some race conditions and cloud-init needing to initialize some things through templates passed through the hypervisor. We found that we could eliminate this step, ultimately getting us down to our current "record" time of about 25 seconds for a base Noble image on Epyc.
Although our current best provisioning time - a base Noble image on an Epyc instance - is about 25 seconds, your results may still vary a bit. Although images are now generally cached, sometimes our system needs to re-download an image, adding about 20-30 seconds. In addition, our LAMP/LEMP stacks are a bit slower to provision as we do a lot of additional configuration steps to set up users for FTP, MySQL, etc. Which profile and platform you provision on also impacts provisioning time: obviously a pico4 profile on Xeon, which only has 1 CPU thread to work with, will provision slower than a beastly Epyc instance.
With all that said however, you should in real life - most of the time - see sub 1 minute provisioning and worst case 1-2 minute provisioning time.
We are pretty happy with that result :)
Arni Johannesson, CEO Webdock