NVIDIA Blueprint Toolkit — Technical Reference
Full technical walkthrough of the NVIDIA Blueprint Toolkit — covering the RAG Concurrent User Calculator, Multi-GPU parallelism strategy engine, WestconComstor Verified badge system, Platform Selector (the original deep-dive), and Kubernetes Deployment Guide. Sections 1–8 cover the Platform Calculator; Sections 9–12 document modules added or substantially updated since the initial release.
1. Architecture Overview
The Platform Calculator is a vanilla-JS ES-module application bundled with Vite. It is split across three source files:
| File | Responsibility |
| platform-calculator.js | Data loading, sheet parsing, filter state, matching engine, GPU/NIC slot logic. The single exported entry point is initPlatformCalculator(). |
| platform-calculator-htmlbuild.js | All HTML construction. Exports renderUI(), renderResults() and clearFilters(). Never imports from platform-calculator.js, to avoid circular dependencies. |
| style.css | All visual styles for cards, filters, badges, collapsible sections and the summary bar. |
Seven module-level state variables hold the loaded data and the lookup maps derived from it:
let _platforms = []; // parsed platform objects
let _gpuCatalogue = []; // GPU catalogue rows
let _cpuCatalogue = []; // CPU catalogue rows
let _nicCatalogue = []; // NIC catalogue rows
let _pidToGpu = {}; // Cisco GPU PID → { model, vramGB, entry }
let _pidToNic = {}; // Cisco NIC PID → { description, type, ports, … }
let _expCageGpuByChassisId = {}; // GPU slots from type=chnodeexpcage rows, keyed by intoChassisId
2. Data Sources
All data is fetched from Google Sheets via the gviz/tq JSON endpoint. The helper fetchGvizSheet(url) strips the google.visualization.Query.setResponse(…) wrapper, parses the JSON table, and returns { headers, dataRows, fRows } — where fRows carries the formatted (.f) cell string used when the numeric .v value is unreliable (e.g., CPU core counts stored as Excel date serials).
| Constant | Sheet (internal name) | Contents |
| SHEET_URL | NVBPTK_PLF | Platform catalogue. One row per server model. Columns cover: identity, CPU config, memory maximums per CPU-qty/gen, drive bay quantities and sizes, PCIe riser slot specs, GPU choices per riser variant, NIC choices per riser variant. |
| GPU_SHEET_URL | NVBPTK_GPU | GPU catalogue. Columns: GPU model, manufacturer, category, VRAM (raw string), interconnect type. |
| CPU_SHEET_URL | NVBPTK_CPU | CPU catalogue. Columns: model number, manufacturer, generation, class, core count (formatted string), clock speeds, TDP, memory type/speed, socket, SPECint score, release date. |
| NIC_SHEET_URL | NVBPTK_NIC | NIC catalogue. Columns: part number (Cisco PID), description, type, port count, speed (Gb), ethernet/IB flags, medium, connector. |
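The wrapper-stripping and table-parsing step described for fetchGvizSheet() can be sketched as follows. This is a minimal illustration, not the toolkit's actual code: the function name parseGvizResponse and the sample payload (including the CPU-X and CPU-Y model names) are hypothetical, and the network fetch is left out so the snippet runs standalone.

```javascript
// Hypothetical stand-in for the parsing half of fetchGvizSheet(); the real
// implementation is not reproduced in this reference.
function parseGvizResponse(text) {
  // The payload arrives wrapped as google.visualization.Query.setResponse({...});
  const start = text.indexOf('(');
  const end = text.lastIndexOf(')');
  if (start === -1 || end <= start) {
    throw new Error('Response is not wrapped in setResponse(...)');
  }
  const table = JSON.parse(text.slice(start + 1, end)).table;

  const headers = table.cols.map((col) => col.label || '');
  // Empty cells arrive as null, so every cell access is guarded.
  const dataRows = table.rows.map((row) => row.c.map((cell) => (cell ? cell.v : null)));
  // fRows keeps the formatted (.f) string, or null when the cell has none.
  const fRows = table.rows.map((row) =>
    row.c.map((cell) => (cell && cell.f !== undefined ? cell.f : null))
  );
  return { headers, dataRows, fRows };
}

// Made-up sample: a core count whose raw value is unreliable but whose
// formatted string is usable.
const sample =
  '/*O_o*/\ngoogle.visualization.Query.setResponse(' +
  JSON.stringify({
    table: {
      cols: [{ label: 'model' }, { label: 'cores' }],
      rows: [
        { c: [{ v: 'CPU-X' }, { v: 45000, f: '64' }] },
        { c: [{ v: 'CPU-Y' }, null] },
      ],
    },
  }) +
  ');';
const parsed = parseGvizResponse(sample);
```

Here parsed.dataRows[0][1] holds the raw value 45000 while parsed.fRows[0][1] holds the formatted string '64', which is the case the fRows array exists for.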
3. Initialisation Flow
initPlatformCalculator() is called once from main.js on page load. Its steps:
- Injects a loading message into #plat-calc-wrap.
- Calls Promise.all([loadPlatforms(), loadGpuCatalogue(), loadCpuCatalogue(), loadNicCatalogue()]) so that all four sheets are fetched in parallel.
- Calls buildPidToGpuMap(), which scans every riser slot in every platform (plus chassis-node Rear Mezz slots, MGPU-server GPU bay slots, and expansion cage slots) for GPU PIDs and their embedded display names, cross-references the GPU catalogue, and populates _pidToGpu.
- Calls buildPidToNicMap(), which runs the same scan for NIC PIDs and populates _pidToNic.
- After all platforms are parsed, parseExpansionCageRows() scans all PLF rows (not just enumerate=Yes) for type=chnodeexpcage rows, reads their ifchassisnode-RearMezz-gpuChoice, and stores the GPU slots keyed by ifnode-intoChassisId. Each chassis-node platform then has its expansionCageGpu array linked by matching intoChassisId.
- Calls renderUI(…) to build the filter panel and placeholder.
Riser Group Definitions (RISER_GROUPS)
Three PCIe riser groups are defined as a static constant. Each group maps to a set of mutually-exclusive variants (the platform can be ordered with any one variant per riser):
{ id:'r1', label:'Riser 1', requiresCPU2:false,
variants:[{id:'1A', gpuCol:'ifnode-pcie-riser1-gpuChoice-1A', nicCol:'…-nicChoice-1A'}, …] }
{ id:'r2', label:'Riser 2', requiresCPU2:true, variants:[…2A, 2B, 2C] }
{ id:'r3', label:'Riser 3', requiresCPU2:true, variants:[…3A, 3B, 3C, 3D] }
requiresCPU2: true means the riser is physically connected to CPU socket 2 and is only populated when ≥ 2 CPUs are installed. This flag is enforced in all slot-count calculations.
4. Platform Parsing (parsePlatform)
Called for every data row in the PLF sheet. Returns null (row is skipped) if enumerate ≠ "Yes". Otherwise builds a rich platform object. Key parsed fields:
Identity & CPU
| Object field | Sheet column | Notes |
| id, name | id, name | Display identity |
| maxCPUQty | ifnode-maxCPUQty | Hard maximum CPU sockets |
| cpuCombinations | ifnode-cpuCombinations | Valid CPU counts, e.g. "1,2" → [1,2]. Parsed by parseCpuCombinations(), which splits on whitespace/commas. |
| cpuMfg, cpuGenA, cpuGenB | ifnode-cpuMfg, ifnode-cpuGenA, ifnode-cpuGenB | Manufacturer and integer generation numbers. null when the column is blank or "na". |
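A minimal sketch of the whitespace/comma split described for parseCpuCombinations() is shown below. The function name carries a Sketch suffix because the real body is not reproduced here; dropping non-numeric tokens such as "na" is an assumption based on how blank and "na" cells are treated elsewhere in the parser.

```javascript
// Sketch only: splits a cell such as "1,2" or "1, 2 4" into integer CPU counts.
function parseCpuCombinationsSketch(raw) {
  if (raw === null || raw === undefined) return [];
  return String(raw)
    .split(/[\s,]+/)                 // split on any run of whitespace or commas
    .map((token) => parseInt(token, 10))
    .filter((n) => Number.isInteger(n) && n > 0); // assumption: discard "na" and blanks
}
```

With this shape, an empty result signals an old-format row, which is the case the CPU-quantity check falls back to maxCPUQty for.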
Memory
Five columns are read, one per supported pairing of generation (A / B) and CPU count (1 / 2 / 4); there is no Gen B, 4-CPU column. Missing or "na" values are stored as 0.
mem: {
maxA1: safeF('ifnode-maxMGB-genA-cpux1'), // Gen A, 1 CPU
maxA2: safeF('ifnode-maxMGB-genA-cpux2'), // Gen A, 2 CPUs
maxA4: safeF('ifnode-maxMGB-genA-cpux4'), // Gen A, 4 CPUs
maxB1: safeF('ifnode-maxMGB-genB-cpux1'), // Gen B, 1 CPU
maxB2: safeF('ifnode-maxMGB-genB-cpux2'), // Gen B, 2 CPUs
}
Drives
Nine drive types are defined in DRIVE_DEFS (2.5″ SAS/SATA/NVMe, 3.5″ HDD, NVMe E3.S, etc.). For each type, two columns are read:
- Max-qty column, e.g. "[FRONT] 24 [RISER 1B] 2 [RISER 3B] 2". Parsed by both parseDriveQty() (display breakdown) and parsePositionedQty() (per-position array for CPU filtering).
- Size column, e.g. "[FRONT] 960, 1600, … 61400 [FRONTMEZZ] 6400". Parsed by parseSizeGBPerPosition() so each position has its own allowed sizes.
Drives with no valid qty or size data are discarded (.filter(d => d.qty !== null && d.sizes.length > 0)).
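The bracketed max-qty format lends itself to a single global regular expression. The sketch below shows one way the per-position array could be produced; the function name and the { position, qty } output shape are assumptions for illustration, not the toolkit's actual parsePositionedQty().

```javascript
// Sketch only: turns "[FRONT] 24 [RISER 1B] 2" into a per-position array.
function parsePositionedQtySketch(raw) {
  const out = [];
  const re = /\[([^\]]+)\]\s*(\d+)/g; // "[LABEL] <integer>"
  let m;
  while ((m = re.exec(String(raw))) !== null) {
    out.push({ position: m[1].trim(), qty: Number(m[2]) });
  }
  return out;
}

const positions = parsePositionedQtySketch('[FRONT] 24 [RISER 1B] 2 [RISER 3B] 2');
// Total bays before any CPU or size filtering is applied.
const totalBays = positions.reduce((sum, p) => sum + p.qty, 0);
```

A token with no numeric quantity, such as [ISMGPUSERVER], produces no entries here, which is consistent with those drives being handled on a separate path.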
GPU & NIC Riser Data
For every RISER_GROUPS variant, two sheet columns are read:
gpuCol → parseGpuChoice() — parses "SLOT1: UCSC-GPU-L4 (NVIDIA L4 (24 GB)) SLOT2: …" into an array of { slotKey, pids, pidNames }. The PID regex is UCS[A-Z]-GPU- so both UCSC-GPU- (rack server) and UCSX-GPU- (chassis node Mezz) PIDs are captured.
nicCol → parseNicChoice() — same format but with NIC PIDs.
Results are stored in p.risers[rgId][variantId] and p.nicRisers[rgId][variantId]. Boolean flags p.hasGpuData and p.hasNicData are set if any slots were found.
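As a rough illustration of the slot format, the sketch below splits a cell on SLOTn: markers and collects PIDs with the UCS[A-Z]-GPU- prefix. It is a simplification under stated assumptions: the character set allowed after the prefix is a guess, the second PID in the sample is invented, and pidNames extraction is omitted because the display names contain nested parentheses.

```javascript
// Sketch only: extracts { slotKey, pids } per slot; pidNames are not handled.
function parseGpuChoiceSketch(raw) {
  const slots = [];
  // Capture each "SLOTn:" marker and everything up to the next marker or end of string.
  const slotRe = /(SLOT\d+):\s*([\s\S]*?)(?=SLOT\d+:|$)/g;
  let m;
  while ((m = slotRe.exec(String(raw))) !== null) {
    // Assumption: a PID is the prefix followed by upper-case letters, digits and hyphens.
    const pids = m[2].match(/UCS[A-Z]-GPU-[A-Z0-9-]+/g) || [];
    if (pids.length > 0) slots.push({ slotKey: m[1], pids });
  }
  return slots;
}

// Sample cell; UCSX-GPU-EXAMPLE is a made-up PID used to show the UCSX prefix.
const gpuSlots = parseGpuChoiceSketch(
  'SLOT1: UCSC-GPU-L4 (NVIDIA L4 (24 GB)) SLOT2: UCSC-GPU-L4 (NVIDIA L4 (24 GB)), UCSX-GPU-EXAMPLE (Example GPU (80 GB))'
);
```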
Special Platform Types
Three additional platform types require extra column reads beyond the standard riser slots:
| type value | Additional columns read | Effect |
| chassisnode | ifchassisnode-RearMezz-gpuChoice | Parsed into p.chassisMezzGpu (same slot format). These GPUs are counted in getGpuConfig as a “Rear Mezz” group, always included (not CPU-count-dependent). |
| mgpuserver | ifmgpuserver-gpuChoice, ifmgpuserver-GPUtoDriveRules | mgpuGpuChoice replaces riser slots entirely (the platform uses dedicated GPU bays). mgpuDriveRules is parsed as a { gpuQty → maxDrives } map (e.g. [2] 24 [4] 16 [8] 8), used when a drive maxqty cell contains [ISMGPUSERVER]. |
| chnodeexpcage | ifchassisnode-RearMezz-gpuChoice, ifnode-intoChassisId | Scanned by parseExpansionCageRows() (not parsePlatform). GPU slots are stored in _expCageGpuByChassisId and linked to chassis-node platforms via p.intoChassisId matching. |
GPU Combinations (ifnode-gpuCombinations)
When this column is populated (e.g. "2, 4, 8"), the parsed array is stored as p.gpuCombinations. It constrains which GPU quantities are valid for the platform — used during GPU filter matching to ensure only valid configurations are returned.
Drives with [ISMGPUSERVER]
If a drive’s maxqty cell contains the literal token [ISMGPUSERVER], the drive is flagged isMgpuVariable: true and its qty is stored as null. At match time, the actual drive limit is read from mgpuDriveRules[chosenGpuQty].
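The GPU-to-drive rule lookup can be sketched as below. Both function names are hypothetical; the fallback to the largest limit when no GPU quantity is chosen follows the behaviour described for the unfiltered case, and returning 0 for an empty rule set is an assumption.

```javascript
// Sketch only: "[2] 24 [4] 16 [8] 8" → { 2: 24, 4: 16, 8: 8 }
function parseMgpuDriveRulesSketch(raw) {
  const rules = {};
  const re = /\[(\d+)\]\s*(\d+)/g; // "[gpuQty] maxDrives"
  let m;
  while ((m = re.exec(String(raw))) !== null) {
    rules[Number(m[1])] = Number(m[2]);
  }
  return rules;
}

// Resolve the bay limit for an isMgpuVariable drive at match time.
function mgpuDriveLimit(rules, chosenGpuQty) {
  if (chosenGpuQty != null && rules[chosenGpuQty] !== undefined) {
    return rules[chosenGpuQty];
  }
  // No GPU quantity chosen: use the most generous limit across all rules.
  const limits = Object.values(rules);
  return limits.length > 0 ? Math.max(...limits) : 0; // assumption: 0 when no rules exist
}

const driveRules = parseMgpuDriveRulesSketch('[2] 24 [4] 16 [8] 8');
```

For example, mgpuDriveLimit(driveRules, 4) resolves to 16 bays, while mgpuDriveLimit(driveRules, null) resolves to 24.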
5. Filter UI & runFilter()
renderUI() builds the filter panel (left column) and the results placeholder (right column). Every change to a filter input calls runFilter() via change events on selects and Enter keydown on text/number inputs.
runFilter() reads fourteen DOM fields:
| DOM id | Variable | Type |
| plat-cpu-mfg | cpuMfg | string: CPU manufacturer filter (empty = any). Derived from ifnode-cpuMfg in PLF. |
| plat-mem | memGB | float: minimum memory in GB |
| plat-cpu-gen | cpuGen | float: required CPU generation number (0 = any) |
| plat-cpu-qty | cpuQty | int: exact CPU count (0 = any) |
| plat-drv-qty | drvQty | int: minimum drive bays |
| plat-drv-size | drvSize | float: minimum drive size in GB |
| plat-gpu-type | gpuPid | string: Cisco GPU PID (empty = any/none) |
| plat-xseries-expansion | includeExpansion | bool: include X-Series GPU Node (UCSX-9508-D expansion cage). Default: on. |
| plat-gpu-mem | gpuMemGB | float: minimum total GPU VRAM |
| plat-cores | coresReq | int: total cores required across all CPUs |
| plat-nic-type | nicPid | string: Cisco NIC PID (empty = any/none) |
| plat-nic-qty | nicQty | int: minimum NIC card count |
| plat-nic-speed | nicSpeed | float: minimum NIC speed in Gb/s |
| plat-nic-ports | nicPortQty | int: minimum total NIC port count |
These are bundled into a criteria object. includeExpansion is bound into a gpuFn closure that is passed to renderResults() so all GPU config calls use the same toggle state. A criteria chip "X-Series Expansion excluded" is shown when the toggle is off.
6. Matching Logic (matchPlatform)
Each filter check returns { match: false } immediately on failure (short-circuit). The checks run in this order:
Step 1 — CPU Quantity
If cpuQty > 0, the platform's cpuCombinations array is checked. If the array is populated, the selected count must be in the list. If the array is empty (old-format rows), the check falls back to p.maxCPUQty ≥ cpuQty.
Step 2 — CPU Generation
If cpuGen > 0, the platform's cpuGenA and cpuGenB values are compared. If neither matches, the platform fails. Matching gens are collected into activeGens (['A'], ['B'], or ['A','B']). When no gen filter is set, all non-null gens are included.
Step 3 — Memory
The set of CPU counts to check against (effQties) is determined as follows:
- If cpuQty > 0: effQties = [cpuQty], so only the selected count is checked.
- If no filter: effQties = cpuQtyOptions (all valid counts from cpuCombinations), so the platform's highest available memory wins.
For each combination of activeGens × effQties, the lookup key is p.mem[`max${gen}${qty}`] (e.g. maxA2 for Gen A, 2 CPUs). The global maximum becomes maxMem. If maxMem < memGB the platform is rejected.
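The memory lookup above amounts to a maximum over a small grid of keys. The sketch below assumes a mem object shaped like the one built in parsePlatform; the function name and the sample capacities are illustrative only.

```javascript
// Sketch only: highest memory ceiling over every active generation × CPU count.
function maxMemoryFor(mem, activeGens, effQties) {
  let maxMem = 0;
  for (const gen of activeGens) {
    for (const qty of effQties) {
      // Keys that do not exist (e.g. maxB4) count as 0.
      const value = mem[`max${gen}${qty}`] || 0;
      if (value > maxMem) maxMem = value;
    }
  }
  return maxMem;
}

// Illustrative capacities in GB, not taken from any real platform row.
const memSample = { maxA1: 2048, maxA2: 4096, maxA4: 0, maxB1: 3072, maxB2: 6144 };
const memGB = 5000;
const rejected = maxMemoryFor(memSample, ['A'], [1, 2]) < memGB; // Gen A only tops out at 4096
```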
Step 4 — Effective CPU Qty for Slots
After the memory check, a single effectiveCpuQty is computed for all subsequent slot-based checks (drives, GPU, NIC):
const maxValidCpu = p.cpuCombinations.length > 0 ? Math.max(...p.cpuCombinations) : p.maxCPUQty; // fall back for old-format rows
const effectiveCpuQty = cpuQty > 0 ? cpuQty : maxValidCpu;
This ensures that when no CPU filter is set, the full configuration (all risers populated) is shown.
Step 5 — GPU (runs before drives)
The GPU check is performed before the drive loop so that chosenGpuQty is available for [ISMGPUSERVER] drive limit calculations. Two sub-paths:
- With GPU filter (gpuPid set): calls getGpuConfig(p, gpuPid, effectiveCpuQty, includeExpansion). If p.gpuCombinations is populated, the raw slot count is constrained to valid combinations that fit within the slot maximum. Then the memory requirement (gpuMemGB) is applied. The largest qualifying combination becomes chosenGpuQty and drives the VRAM total. A validCombos array is stored for card display.
- Without GPU filter: chosenGpuQty is still resolved from p.gpuCombinations or p.mgpuDriveRules so that isMgpuVariable drives can be sized correctly.
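The filtered path can be sketched as a two-stage filter over candidate quantities. This is an illustration under assumptions: the function name, its parameters, and the choice to build validCombos after the VRAM filter are not confirmed by the source, and a platform without gpuCombinations is assumed to offer only its raw slot maximum.

```javascript
// Sketch only: pick the largest GPU quantity that fits the slots and meets the VRAM floor.
function chooseGpuQty(slotMax, gpuCombinations, vramPerGpuGB, gpuMemGB) {
  // Stage 1: restrict to combinations that fit within the available slots.
  const candidates =
    gpuCombinations.length > 0
      ? gpuCombinations.filter((q) => q <= slotMax)
      : slotMax > 0 ? [slotMax] : [];
  // Stage 2: keep only quantities whose total VRAM meets the requirement.
  const validCombos = candidates
    .filter((q) => q * vramPerGpuGB >= gpuMemGB)
    .sort((a, b) => a - b);
  if (validCombos.length === 0) return { match: false };
  const chosenGpuQty = validCombos[validCombos.length - 1];
  return {
    match: true,
    chosenGpuQty,
    totalVramGB: chosenGpuQty * vramPerGpuGB,
    validCombos,
  };
}

// 6 usable slots, combinations "2, 4, 8", 80 GB per GPU, 200 GB required.
const gpuPick = chooseGpuQty(6, [2, 4, 8], 80, 200);
```

In this example 8 is dropped because it exceeds the slot maximum and 2 is dropped because 160 GB falls short of the requirement, leaving 4 GPUs and 320 GB.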
Step 6 — Drives
For each drive type in p.drives, two processing paths exist:
Standard drives (qty not null):
- CPU filtering: getRiserMinCpu(posLabel) excludes riser positions needing more CPUs than effectiveCpuQty.
- Size filtering: if drvSize > 0, only positions whose sizes include a value ≥ drvSize count towards effectiveQty.
- Quantity check: if drvQty > 0 and effectiveQty < drvQty, this drive type is skipped.
- Storage totals: maxStorageGB is the sum over available positions of qty × maxSizeForPosition.
MGPU-server variable drives (isMgpuVariable: true, maxqty cell was [ISMGPUSERVER]):
- Looks up p.mgpuDriveRules[chosenGpuQty] to determine the maximum bay count for the chosen GPU configuration.
- When no GPU filter is set, uses the maximum drive limit across all GPU combinations.
- The card shows a per-GPU-configuration drive table (e.g. 2 GPU→ 24 bays, 4 GPU→ 16 bays, 8 GPU→ 8 bays) instead of a fixed bay count.
If any drive filter is active and driveOptions is empty, the platform fails.
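For the standard-drive path, the counting can be sketched as a single pass over positions. The position objects here carry a precomputed minCpu field, which is a simplification of calling getRiserMinCpu(posLabel); treating a size-filtered position as unavailable for the storage total is also an assumption, and all sample numbers are illustrative.

```javascript
// Sketch only: bay count and storage ceiling for one drive type.
function summariseDrive(positions, effectiveCpuQty, drvSize) {
  let effectiveQty = 0;
  let maxStorageGB = 0;
  for (const pos of positions) {
    if (pos.minCpu > effectiveCpuQty) continue; // riser not populated at this CPU count
    const sizes = drvSize > 0 ? pos.sizes.filter((s) => s >= drvSize) : pos.sizes;
    if (sizes.length === 0) continue;           // no drive large enough in this position
    effectiveQty += pos.qty;
    maxStorageGB += pos.qty * Math.max(...sizes);
  }
  return { effectiveQty, maxStorageGB };
}

const drivePositions = [
  { label: 'FRONT', qty: 24, sizes: [960, 61400], minCpu: 1 },
  { label: 'RISER 1B', qty: 2, sizes: [960], minCpu: 1 },
  { label: 'RISER 3B', qty: 2, sizes: [960, 1600], minCpu: 2 },
];
const oneCpu = summariseDrive(drivePositions, 1, 0);     // Riser 3B excluded
const twoCpu = summariseDrive(drivePositions, 2, 1000);  // Riser 1B fails the size filter
```

A caller would then skip the drive type when drvQty > 0 and effectiveQty < drvQty.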
Step 7 — CPU Cores
If coresReq > 0, the per-CPU threshold is derived as perCpuCoresReq = ⌈coresReq / effectiveCpuCount⌉. The platform passes if any CPU in getCpusForPlatform() has known cores ≥ threshold or unknown cores (unknown = cannot rule out). Fails only when every catalogued CPU has verified cores below the threshold.
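The pass/fail rule for cores, including the benefit of the doubt for unknown core counts, can be sketched as follows. The function name and the representation of unknown cores as null are assumptions; the CPU list stands in for the result of getCpusForPlatform().

```javascript
// Sketch only: passes when at least one CPU meets the per-socket threshold
// or has an unknown core count.
function passesCoresCheck(cpus, coresReq, effectiveCpuQty) {
  if (!(coresReq > 0)) return true; // filter not active
  const perCpuCoresReq = Math.ceil(coresReq / effectiveCpuQty);
  return cpus.some((cpu) => cpu.cores == null || cpu.cores >= perCpuCoresReq);
}

// Illustrative catalogue entries.
const knownCpus = [{ model: 'A', cores: 16 }, { model: 'B', cores: 32 }];
const withUnknown = [{ model: 'A', cores: 16 }, { model: 'C', cores: null }];
```

With two CPUs, a 64-core requirement needs 32 cores per socket and passes on model B, whereas 65 cores rounds up to 33 per socket and fails unless some CPU has an unknown core count.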
Step 8 — NIC
Activated when any NIC criterion is non-zero. Two sub-paths:
- Specific PID: calls getNicConfig(p, nicPid, effectiveCpuQty), checks total count ≥ nicQty and total ports ≥ nicPortQty.
- Any NIC: iterates all PIDs present in the platform's NIC risers, filters by speed ≥ nicSpeed, qty and port requirements, selects the PID that maximises total slot count.
A nicResult object is attached to the match on success.
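The "Any NIC" path is essentially a filter followed by an arg-max. The sketch below assumes each candidate has already been reduced to { pid, speedGb, portsPerCard, totalCount }, where totalCount would come from getNicConfig(); that record shape, the function name, and the placeholder PIDs are all hypothetical.

```javascript
// Sketch only: choose the NIC PID with the most slots among those meeting every criterion.
function pickBestNic(candidates, { nicSpeed = 0, nicQty = 0, nicPortQty = 0 } = {}) {
  let best = null;
  for (const c of candidates) {
    if (c.speedGb < nicSpeed) continue;                       // too slow
    if (c.totalCount < nicQty) continue;                      // not enough cards fit
    if (c.totalCount * c.portsPerCard < nicPortQty) continue; // not enough ports in total
    if (best === null || c.totalCount > best.totalCount) best = c;
  }
  return best; // null means the platform fails the NIC step
}

// Placeholder PIDs, not real part numbers.
const nicCandidates = [
  { pid: 'NIC-A', speedGb: 25, portsPerCard: 2, totalCount: 4 },
  { pid: 'NIC-B', speedGb: 100, portsPerCard: 2, totalCount: 2 },
  { pid: 'NIC-C', speedGb: 100, portsPerCard: 1, totalCount: 3 },
];
const fastNic = pickBestNic(nicCandidates, { nicSpeed: 100, nicPortQty: 4 });
```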
7. GPU & NIC Slot Calculation
getGpuConfig(p, pid, effectiveCpuQty, includeExpansion)
Four processing paths, evaluated in priority order:
- MGPU-server (p.type === 'mgpuserver'): reads p.mgpuGpuChoice slots directly. Returns immediately without checking risers.
- PCIe risers: iterates RISER_GROUPS. Riser groups with requiresCPU2 = true are skipped when effectiveCpuQty < 2. For each eligible group, bestVariantForPid() picks the variant that maximises the slot count.
- Chassis-node Rear Mezz (p.chassisMezzGpu): counted as a fixed additional group, labelled “Rear Mezz”. Always included regardless of CPU count.
- X-Series Expansion cage (p.expansionCageGpu): only included when includeExpansion === true (controlled by the Include X Series GPU Node toggle). Breakdown entries are flagged isExpansion: true and rendered with an amber “via Expansion Node” badge.
Returns:
{ totalCount, totalVramGB, breakdown: [{ label, variantId, count, slotDetails, isExpansion? }] }
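The PCIe-riser path can be illustrated with a small sketch of variant selection. It assumes p.risers has the shape risers[groupId][variantId] = [{ slotKey, pids }], as produced by the parsing step; the function names carry a Sketch suffix or are hypothetical, and the Rear Mezz, MGPU-server and expansion-cage paths are left out.

```javascript
// Count the slots in one variant that accept the given PID.
function countSlotsForPid(slots, pid) {
  return slots.filter((slot) => slot.pids.includes(pid)).length;
}

// Sketch only: pick the variant of one riser group with the most matching slots.
function bestVariantForPidSketch(variants, pid) {
  let best = { variantId: null, count: 0 };
  for (const [variantId, slots] of Object.entries(variants)) {
    const count = countSlotsForPid(slots, pid);
    if (count > best.count) best = { variantId, count };
  }
  return best;
}

// Sum the best variant per eligible riser group, honouring requiresCPU2.
function gpuCountFromRisers(risers, riserGroups, pid, effectiveCpuQty) {
  let total = 0;
  for (const group of riserGroups) {
    if (group.requiresCPU2 && effectiveCpuQty < 2) continue;
    total += bestVariantForPidSketch(risers[group.id] || {}, pid).count;
  }
  return total;
}

const l4 = { pids: ['UCSC-GPU-L4'] };
const sampleRisers = {
  r1: { '1A': [{ slotKey: 'SLOT1', ...l4 }, { slotKey: 'SLOT2', ...l4 }], '1B': [{ slotKey: 'SLOT1', ...l4 }] },
  r2: { '2A': [{ slotKey: 'SLOT1', ...l4 }] },
};
const sampleGroups = [
  { id: 'r1', requiresCPU2: false },
  { id: 'r2', requiresCPU2: true },
];
```

Because variants within a group are mutually exclusive, the count for a group is the best single variant rather than the sum of all variants.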
getNicConfig(p, pid, effectiveCpuQty)
Mirrors getGpuConfig exactly, reading from p.nicRisers instead of p.risers. Returns:
{ totalCount, breakdown: [{ label, variantId, count, slotDetails }] }
getRiserMinCpu(posLabel)
Used by the drive-counting logic. Strips any "RISER " prefix, reads the first digit as the riser group number, and returns the requiresCPU2 flag from RISER_GROUPS as a minimum count (1 or 2). Non-riser positions (FRONT, FRONTMEZZ, MIDPLANE) always return 1.
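A minimal sketch of this lookup follows. The trimmed riser-group table contains only the fields needed here, and the handling of labels without a "RISER " prefix (for example "2A") is an assumption drawn from the phrase "strips any prefix".

```javascript
// Trimmed copy of the riser groups, keeping only what this helper needs.
const RISER_GROUPS_MIN = [
  { id: 'r1', requiresCPU2: false },
  { id: 'r2', requiresCPU2: true },
  { id: 'r3', requiresCPU2: true },
];

// Sketch only: minimum CPU count needed for a drive position to be populated.
function getRiserMinCpuSketch(posLabel) {
  const stripped = String(posLabel).trim().toUpperCase().replace(/^RISER\s*/, '');
  const m = /^(\d)/.exec(stripped);
  if (!m) return 1; // FRONT, FRONTMEZZ, MIDPLANE and other non-riser positions
  const group = RISER_GROUPS_MIN.find((g) => g.id === `r${m[1]}`);
  return group && group.requiresCPU2 ? 2 : 1;
}
```

For example, 'RISER 1B' resolves to 1, 'RISER 3B' resolves to 2, and 'FRONT' resolves to 1.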
NIC Capability Badges
The NIC Capabilities section renders colour-coded badge rows (port count / speed / type / medium / connector) using buildNicBadges(), matching the visual style of the platform card collapsed-view badges.
8. Rendering Pipeline
renderResults(results, criteria, …)
Receives the array of passing match objects. Renders a chip row summarising active criteria, then calls buildCard(r, …) for each result and sets #plat-results innerHTML.
buildCard(r, criteria, …)
Each result card is wrapped in a <details class="plat-card-collapse"> element — collapsed by default. The <summary> contains the card header (platform name, ID, badges) plus a compact summary bar showing: Memory max · Best drive configuration · GPU result (if filter active) · NIC result (if filter active). Clicking the header or pressing the toggle arrow (▶/▼) expands the full detail body.
The full card body contains five sections, each built by a dedicated helper:
| Section | Built by | Contents |
| Memory | inline in buildCard | Max memory line with Gen/qty annotation. When a memory filter is active, the matched config is highlighted. |
| Drives | inline in buildCard | Per-drive-type rows. Standard drives show bay count, size options, max storage and per-position breakdown. MGPU-server drives show a GPU-config→drive-limit table (e.g. 2 GPU → 24 bays, 4 GPU → 16 bays) instead of a fixed count. |
| GPU | inline in buildCard | When gpuPid filter is set: hero count, VRAM total, riser breakdown. When validCombos has multiple entries, a blue note shows all valid GPU configurations (e.g. “Valid GPU configurations: 2, 4, 8 × this GPU”). Expansion-node breakdown entries show an amber “via Expansion Node” badge. When no filter: summary table of all available GPU PIDs. |
| NIC | buildNicSection() | Same dual-mode as GPU. Filter-active mode shows hero count, NIC name, speed, port summary. No-filter mode shows all NIC PIDs across all risers with colour-coded attribute badges (port count / speed / type / medium / connector). |
| CPU Models | buildCpuSection() + <details> | A per-generation table of supported CPUs with cores, speed, TDP, SPECint. Rows are highlighted green/grey/red based on cores filter. The raw model list is in a nested collapsible. |
9. Application Module Map
The toolkit has grown beyond the Platform Calculator. Every tab is a distinct ES module initialised from main.js on page load:
| Tab | Module(s) | Entry point | Description |
| Calculator | rag-calculator.js | initRagCalculator() | Concurrent user estimator for NVIDIA AI Blueprints. See §10. |
| GPUs | gpu-products.js | initGpuProducts() | GPU catalogue browser — renders cards from GPU_GROUPS. |
| Models | llm-models.js | initLlmModels() | LLM model catalogue browser — renders cards from LLM_MODEL_GROUPS. |
| Platforms | platform-calculator.js, platform-calculator-htmlbuild.js | initPlatformCalculator() | Server platform selector. Fully documented in §1–8 of this page. |
| Blueprints | reference-tables.js, blueprints.js | initReferenceTables() | Blueprint reference cards with WestconComstor Verified badges. See §11. |
| BP Components | bp-components.js | initBpComponents() | NIM component reference — cards from COMPONENTS with VRAM / GPU / CPU fields. |
| Industry / Vertical | industry-vertical.js, blueprints.js | initIndustryVertical() | Industry-tagged blueprint cards with WestconComstor Verified badges. See §11. |
| BP_Enterprise RAG | bp-detailed-enterprise-rag.js | initBpDetailedEnterpriseRag() | Deep-dive reference for the Enterprise RAG blueprint architecture. |
| Deployment | deployment.js | initDeploymentGuide() | Kubernetes & Helm deployment guide for PoC and scale-out. See §12. |
10. RAG Calculator — Concurrent User Estimation Engine
The Calculator tab estimates the maximum concurrent users a given hardware and blueprint configuration can support. It is driven by a single module-level state object and a pure calculate() function in rag-calculator.js.
Calculator State (state object)
| Field | Default | Description |
| gpuId | 'H100_SXM_80' | Selected GPU model key. Maps to an entry in GPU_GROUPS. |
| gpuCount | 1 | Total GPU count for the deployment. |
| outputTokens | 300 | Expected average output tokens per LLM response. |
| thinkTime | 30 | User think time in seconds between requests. |
| selectedModelId | Enterprise RAG default | LLM model ID from LLM_MODEL_GROUPS. |
| selectedQuantId | null | Quantization override. null auto-selects the model's maximum quantization. |
| parallelismStrategy | 'tensor' | Active multi-GPU parallelism strategy ID (used only when multiGpuEnabled is true). |
| multiGpuEnabled | false | Multi-GPU strategy toggle state. When false, strategy is forced to single-GPU inference. See §10.3. |
| blueprintId | Enterprise RAG | Active NVIDIA AI Blueprint. Determines which NIM components are shown. |
| components | required on, optional off | Object mapping component ID → { enabled, onGpu }. Optional components can be toggled off; GPU components can be switched to CPU. |
Blueprint & Component Model
Each entry in BLUEPRINTS (blueprints.js) has a required[] and optional[] array of component IDs referencing COMPONENTS in bp-components.js. When a Blueprint is selected the Calculator renders only those components; optional components can be disabled. Blueprints may carry sizingVerified: true (see §11).
The "Show WestconComstor verified only" toggle (#bp-verified-toggle-btn) filters the Blueprint dropdown to entries where sizingVerified === true. Default: off.
Multi-GPU Parallelism Strategy Toggle
A rocker switch (#multi-gpu-toggle-btn) in the Parallelism Strategy row controls whether multi-GPU strategies are enabled. Default: OFF (single GPU inference).
| Toggle state | Effective strategy | UI behaviour |
| OFF (default) | 'request' (forced internally, hidden from user) | Strategy dropdown hidden. Row shows a "Single GPU Inference" info tooltip explaining the mode and how it compares to each multi-GPU strategy. Row is visually dimmed via .rag-strategy-row--single. |
| ON | state.parallelismStrategy (user-selected) | Strategy dropdown visible with all 9 options. Each strategy's tooltip shows a singleGpuVs row (vs single GPU) and a tensorVs row (vs Tensor Parallelism baseline). A red warning (.rag-strategy-row--needs-gpu) appears when the selected strategy requires more GPUs than configured. |
Auto-toggle: The GPU count input is wired so that changing from 1 to ≥ 2 automatically enables the multi-GPU toggle, and reducing back to 1 disables it — preventing impossible strategy/count combinations.
Parallelism Strategies (PARALLELISM_STRATEGIES constant)
Nine strategies are defined. Each carries a tpsEfficiency multiplier (relative to the Tensor Parallelism baseline at 1.00) and a gpuGrouping describing how the GPU pool is partitioned for calculation:
| ID | Label | TPS efficiency | GPU grouping | Primary use case |
| tensor | Tensor Parallelism | 1.00 (baseline) | all GPUs = 1 instance | Large model inference (10B+ params); splits weight tensors across GPUs |
| pipeline | Pipeline Parallelism | 0.80 | linear layer chain | Very large models; splits transformer layers into pipeline stages |
| hybrid | Hybrid Parallelism | 0.90 | all GPUs = 1 instance | 100B+ production deployments combining TP + PP + DP |
| expert | Expert Parallelism (MoE) | 1.20 | all GPUs = 1 instance | MoE models; sparse activation means only a fraction of experts fire per token |
| sequence | Sequence / Context Parallelism | 0.90 | distributed attention | 128 K+ token RAG context windows; distributes attention, not weights |
| kvcache | KV Cache Parallelism | 0.85 | model GPUs + dedicated KV cache GPUs | Production RAG with many concurrent long-context sessions |
| request | Request Parallelism | 0.95 | independent model replicas | High-throughput serving; near-linear user scaling per GPU group |
| data | Data Parallelism | 0.70 | gradient-sync replicas | Training / fine-tuning only; not recommended for production inference |
| offload | CPU / GPU Offloading | 0.15 | CPU-paged weights | VRAM-constrained environments; pages weights via PCIe (5–10× slower) |
Calculation Formula (calculate())
- Effective strategy: effectiveStratId = state.multiGpuEnabled ? state.parallelismStrategy : 'request'.
- GPU instances: determined by gpuGrouping, e.g. 'all' → one instance using all GPUs; 'tp' → gpuCount / minGpusPerInst independent replicas; 'kvcache' → reserves a share of GPUs for KV cache, rest for compute.
- TPS per instance: model token-per-second rate at the chosen quantization, scaled by strat.tpsEfficiency.
- Response time: responseTimeSec = outputTokens / tpsPerInstance.
- QPS: qps = totalTps / outputTokens.
- Concurrent users (Little's Law): concurrentUsers = round(qps × (thinkTime + responseTimeSec)).
The result object is returned to the rendering layer which displays a metric card, GPU VRAM allocation bar chart, and strategy breakdown panel.
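The arithmetic in the formula steps can be checked with a small standalone sketch. It is not the toolkit's calculate(): the function name and parameter object are hypothetical, the instance count is passed in rather than derived from gpuGrouping, and totalTps is assumed to be tpsPerInstance multiplied by the number of instances, which the text does not state explicitly.

```javascript
// Sketch only: throughput figures → concurrent users via Little's Law.
function estimateConcurrentUsers({ instances, baseTpsPerInstance, tpsEfficiency, outputTokens, thinkTime }) {
  const tpsPerInstance = baseTpsPerInstance * tpsEfficiency;
  const totalTps = tpsPerInstance * instances;            // assumption: instances scale linearly
  const responseTimeSec = outputTokens / tpsPerInstance;  // time to generate one response
  const qps = totalTps / outputTokens;                    // responses completed per second
  // Little's Law: users in the system = arrival rate × time each user spends per cycle.
  const concurrentUsers = Math.round(qps * (thinkTime + responseTimeSec));
  return { tpsPerInstance, totalTps, responseTimeSec, qps, concurrentUsers };
}

// Illustrative inputs using the default outputTokens (300) and thinkTime (30 s).
const estimate = estimateConcurrentUsers({
  instances: 1,
  baseTpsPerInstance: 600, // made-up token rate, not a benchmark figure
  tpsEfficiency: 1.0,
  outputTokens: 300,
  thinkTime: 30,
});
```

With these inputs each response takes 0.5 s, throughput is 2 responses per second, and the estimate is round(2 × 30.5) = 61 concurrent users. Because each user spends most of a cycle thinking rather than waiting, think time dominates the result whenever it is much longer than the response time.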
11. WestconComstor Verified Badge System
Blueprints sized and validated by WestconComstor carry sizingVerified: true in their blueprints.js entry. This flag drives both a Calculator filter toggle and distinct visual treatment on the Blueprints and Industry / Vertical tabs.
Calculator Filter Toggle
The "Show WestconComstor verified only" toggle (#bp-verified-toggle-btn) in the Blueprint Selection row filters the Blueprint dropdown to verified entries only. Default: off. Label text was updated from "Show verified only" to "Show WestconComstor verified only" to make the provenance explicit.
Card Visual Treatment (Blueprints & Industry / Vertical tabs)
| Element | CSS class / selector | Visual effect |
| Card wrapper (Blueprints tab) | .ref-bp-card--verified | Gold gradient overlay (135deg, rgba(217,119,6,0.10) → transparent at 50%), amber border, overflow: hidden to clip the star watermark. |
| Card wrapper (Industry tab) | .ind-bp-card--verified | Same gradient and border treatment as Blueprints tab. |
| Star watermark (Blueprints) | .ref-bp-card--verified::before | 90 px ★ pseudo-element in rgba(217,119,6,0.18), anchored top-left. top: -12px compensates for font ascender space above the glyph (★ at 90 px has ~14 px of space above the visual character). |
| Star watermark (Industry) | .ind-bp-card--verified::before | 72 px ★, top: -9px for the smaller size. |
| Badge row | .ref-bp-verified-row / .ind-bp-verified-row | padding-left: 56px / 46px to clear the star watermark horizontally. |
| Badge chip | .ref-badge-verified | Gold chip: rgba(217,119,6,0.14) background, #92400e text, amber border, font-weight: 700. Text: "★ WestconComstor Verified". |
To mark a blueprint as verified, set sizingVerified: true in its blueprints.js entry. Currently only Enterprise RAG (enterprise_rag) is verified.
12. Deployment Guide Tab
The Deployment tab renders a complete Kubernetes & Helm deployment guide, targeting a single-server, 1× GPU proof-of-concept with Qwen3-35B-A3B as the LLM, and extending to a 2-node tensor-parallelism scale-out in Phase 9. All content is rendered in JavaScript by deployment.js → initDeploymentGuide(); there is no server-side component.
Builder Helpers
| Function | Returns | Purpose |
| phase(num, title, icon, bodyHtml, open?) | Accordion HTML string | Wraps content in a .depl-phase collapsible; open by default for Phases 1–5, closed for 6–9. |
| step(n, title, bodyHtml) | Numbered step HTML | Renders a numbered step block with a circle badge and a titled body. |
| cmd(code, lang) | Dark code block HTML | Wraps a command in .depl-cmd-block with language label and a copy-to-clipboard button (wired by initCopyButtons()). |
| note(text) / warn(text) / tip(text) | Callout box HTML | Renders ℹ / ⚠ / ★ callout boxes with the corresponding colour style. |
Guide Phases
| Phase | Title | Default | Key content |
| Arch | Architecture Overview | Open | Kubernetes service topology — all 15+ pods with ports (client → ingress → RAG core → NIMs → data layer). |
| Pre | Prerequisites Checklist | Open | Interactive checkbox list (12 items) covering: NGC API key, hardware, OS, GPU driver, K8s, Helm, StorageClass, GPU Operator, NIM Operator, ECK Operator, NGC CLI. Checks persist in-session. |
| 1 | Storage & GPU Operator | Open | Disk audit + PVC size table; optional dedicated data disk mount at /opt/local-path-provisioner; local-path StorageClass install; PVC smoke-test; GPU Operator Helm install. |
| 2 | Operator Installation | Open | NIM Operator and ECK Operator Helm installs with verification commands. |
| 3 | Deploy Helm Chart | Open | helm upgrade --install with poc-values.yaml; deployment monitoring commands (watch, NIMCache status, events). |
| 4 | Verify Deployment | Open | Expected kubectl get pods output (~15 pods); all services port reference table. |
| 5 | Access the Web UI | Open | Port-forward commands for UI (:3000), RAG API (:8081), Ingestor (:8082). |
| 6 | Configuration | Closed | GPU time-slicing setup (ConfigMap + ClusterPolicy patch); Qwen3-35B-A3B NIM install (4-step); Milvus VDB switch; NIM cache persistence; optional GPU services table. |
| 7 | Lifecycle | Closed | helm upgrade; helm uninstall; full cleanup (NIMCache + PVC deletion + namespace removal). |
| 8 | Troubleshooting | Closed | 6-card grid: Pending pods, Init/ContainerCreating, GPU shortage, ErrImagePull, disk exhaustion, port-forward timeouts. |
| 9 | Scale-Out — TP=2 | Closed | Join 2nd GPU node (kubeadm join); GPU Operator auto-config verify; LeaderWorkerSet install; tp2-values.yaml overlay (NIM_TENSOR_PARALLEL_SIZE=2, podAntiAffinity); apply + verify multi-node operation; scale reference table (TP=1 → TP=2 → TP=4 → TP=8). |