improvise/llama-server.log
Edward Langley ef79a39721 Add CSV import functionality
- Use csv crate for robust CSV parsing (handles quoted fields, empty values, \r\n)
- Extend --import command to auto-detect format by file extension (.csv or .json)
- Reuse existing ImportPipeline and analyzer for field type detection
- Categories detected automatically (string fields), measures for numeric fields
- Updated help text and welcome screen to mention CSV support

All 201 tests pass.
2026-04-01 01:32:19 -07:00
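
The commit above describes auto-detecting the import format from the file extension (`.csv` or `.json`). A minimal sketch of that detection step, using only the standard library — the `ImportFormat` enum and `detect_format` name are hypothetical, not taken from the actual codebase:

```rust
use std::path::Path;

// Hypothetical stand-in for whatever format type the real ImportPipeline uses.
#[derive(Debug, PartialEq)]
enum ImportFormat {
    Csv,
    Json,
}

// Pick the import format from the file extension, case-insensitively.
// Returns None for unknown extensions so the caller can report an error.
fn detect_format(path: &str) -> Option<ImportFormat> {
    match Path::new(path)
        .extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| ext.to_ascii_lowercase())
        .as_deref()
    {
        Some("csv") => Some(ImportFormat::Csv),
        Some("json") => Some(ImportFormat::Json),
        _ => None,
    }
}
```

The real implementation additionally hands CSV files to the `csv` crate (which handles quoted fields, empty values, and `\r\n` line endings) before running the shared analyzer for field type detection.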

ggml_cuda_init: found 1 ROCm devices (Total VRAM: 32752 MiB):
Device 0: AMD Instinct MI100, gfx908:sramecc+:xnack- (0x908), VMM: no, Wave Size: 64, VRAM: 32752 MiB
common_download_file_single_online: no previous model file found /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file (same etag): /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf
common_download_file_single_online: using cached file (same etag): /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00002-of-00003.gguf
common_download_file_single_online: using cached file (same etag): /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00003-of-00003.gguf
build: 8414 (5744d7ec4) with GNU 14.3.0 for Linux x86_64
system info: n_threads = 16, n_threads_batch = 16, total_threads = 16
system_info: n_threads = 16 (n_threads_batch = 16) / 16 | ROCm : NO_VMM = 1 | PEER_MAX_BATCH_SIZE = 128 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX_VNNI = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | AVX512 = 1 | AVX512_VBMI = 1 | AVX512_VNNI = 1 | AVX512_BF16 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
Running without SSL
init: using 15 threads for HTTP server
start: binding port with default address family
main: loading model
srv load_model: loading model '/home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf'
common_init_result: fitting params to device memory, for bugs during this step try to reproduce them with -fit off, or provide --verbose logs if the bug only occurs with -fit on
llama_params_fit_impl: getting device memory data for initial parameters:
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32734 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
create_tensor: loading tensor blk.24.ffn_down_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
create_tensor: loading tensor blk.25.ffn_down_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
create_tensor: loading tensor blk.26.ffn_down_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
create_tensor: loading tensor blk.27.ffn_down_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
create_tensor: loading tensor blk.28.ffn_down_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
create_tensor: loading tensor blk.29.ffn_down_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
create_tensor: loading tensor blk.30.ffn_down_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
create_tensor: loading tensor blk.31.ffn_down_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
create_tensor: loading tensor blk.32.ffn_down_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
create_tensor: loading tensor blk.33.ffn_down_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
create_tensor: loading tensor blk.34.ffn_down_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
create_tensor: loading tensor blk.35.ffn_down_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
create_tensor: loading tensor blk.36.ffn_down_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
create_tensor: loading tensor blk.37.ffn_down_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
create_tensor: loading tensor blk.38.ffn_down_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
create_tensor: loading tensor blk.39.ffn_down_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
create_tensor: loading tensor blk.40.ffn_down_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
create_tensor: loading tensor blk.41.ffn_down_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
create_tensor: loading tensor blk.42.ffn_down_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
create_tensor: loading tensor blk.43.ffn_down_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
create_tensor: loading tensor blk.44.ffn_down_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
create_tensor: loading tensor blk.45.ffn_down_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
create_tensor: loading tensor blk.46.ffn_down_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
create_tensor: loading tensor blk.47.ffn_down_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 0 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent: layer 0: dev = ROCm0
llama_memory_recurrent: layer 1: dev = ROCm0
llama_memory_recurrent: layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent: layer 4: dev = ROCm0
llama_memory_recurrent: layer 5: dev = ROCm0
llama_memory_recurrent: layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent: layer 8: dev = ROCm0
llama_memory_recurrent: layer 9: dev = ROCm0
llama_memory_recurrent: layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent: layer 12: dev = ROCm0
llama_memory_recurrent: layer 13: dev = ROCm0
llama_memory_recurrent: layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent: layer 16: dev = ROCm0
llama_memory_recurrent: layer 17: dev = ROCm0
llama_memory_recurrent: layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent: layer 20: dev = ROCm0
llama_memory_recurrent: layer 21: dev = ROCm0
llama_memory_recurrent: layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent: layer 24: dev = ROCm0
llama_memory_recurrent: layer 25: dev = ROCm0
llama_memory_recurrent: layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent: layer 28: dev = ROCm0
llama_memory_recurrent: layer 29: dev = ROCm0
llama_memory_recurrent: layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent: layer 32: dev = ROCm0
llama_memory_recurrent: layer 33: dev = ROCm0
llama_memory_recurrent: layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent: layer 36: dev = ROCm0
llama_memory_recurrent: layer 37: dev = ROCm0
llama_memory_recurrent: layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent: layer 40: dev = ROCm0
llama_memory_recurrent: layer 41: dev = ROCm0
llama_memory_recurrent: layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent: layer 44: dev = ROCm0
llama_memory_recurrent: layer 45: dev = ROCm0
llama_memory_recurrent: layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 420.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 2
sched_reserve: reserve took 9.38 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (57572 = 54004 + 3147 + 420) + 17592185987085 |
llama_memory_breakdown_print: | - Host | 468 = 204 + 0 + 264 |
llama_params_fit_impl: projected to use 57572 MiB of device memory vs. 32510 MiB of free device memory
llama_params_fit_impl: cannot meet free memory target of 1024 MiB, need to reduce device memory by 26086 MiB
llama_params_fit_impl: context size set by user to 131072 -> no change
llama_params_fit_impl: getting device memory data with all MoE tensors moved to system memory:
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
tensor blk.0.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.0.ffn_down_exps.weight
tensor blk.0.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
tensor blk.0.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
tensor blk.1.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_down_exps.weight
tensor blk.1.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
tensor blk.1.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
tensor blk.2.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_down_exps.weight
tensor blk.2.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
tensor blk.2.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
tensor blk.3.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_down_exps.weight
tensor blk.3.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
tensor blk.3.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
tensor blk.4.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_down_exps.weight
tensor blk.4.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
tensor blk.4.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
tensor blk.5.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_down_exps.weight
tensor blk.5.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
tensor blk.5.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
tensor blk.6.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_down_exps.weight
tensor blk.6.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
tensor blk.6.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
tensor blk.7.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_down_exps.weight
tensor blk.7.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
tensor blk.7.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
tensor blk.8.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_down_exps.weight
tensor blk.8.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
tensor blk.8.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
tensor blk.9.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_down_exps.weight
tensor blk.9.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
tensor blk.9.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
tensor blk.10.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_down_exps.weight
tensor blk.10.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
tensor blk.10.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
tensor blk.11.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_down_exps.weight
tensor blk.11.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
tensor blk.11.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
tensor blk.12.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_down_exps.weight
tensor blk.12.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
tensor blk.12.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
tensor blk.13.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_down_exps.weight
tensor blk.13.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
tensor blk.13.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
tensor blk.14.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_down_exps.weight
tensor blk.14.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
tensor blk.14.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
tensor blk.15.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_down_exps.weight
tensor blk.15.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
tensor blk.15.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
tensor blk.16.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_down_exps.weight
tensor blk.16.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
tensor blk.16.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
tensor blk.17.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_down_exps.weight
tensor blk.17.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
tensor blk.17.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
tensor blk.18.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_down_exps.weight
tensor blk.18.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
tensor blk.18.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
tensor blk.19.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_down_exps.weight
tensor blk.19.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
tensor blk.19.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
tensor blk.20.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_down_exps.weight
tensor blk.20.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
tensor blk.20.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
tensor blk.21.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_down_exps.weight
tensor blk.21.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
tensor blk.21.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
tensor blk.22.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_down_exps.weight
tensor blk.22.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
tensor blk.22.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
tensor blk.23.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
tensor blk.23.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 144 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 736.00 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 146 (with bs=512), 98 (with bs=1)
sched_reserve: reserve took 8.36 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + ( 5568 = 1684 + 3147 + 736) + 17592186039089 |
llama_memory_breakdown_print: | - Host | 52788 = 52524 + 0 + 264 |
llama_params_fit_impl: with only dense weights in device memory there is a total surplus of 25917 MiB
llama_params_fit_impl: id=0, target=31486 MiB
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device CPU, is_swa = 0
load_tensors: layer 1 assigned to device CPU, is_swa = 0
load_tensors: layer 2 assigned to device CPU, is_swa = 0
load_tensors: layer 3 assigned to device CPU, is_swa = 0
load_tensors: layer 4 assigned to device CPU, is_swa = 0
load_tensors: layer 5 assigned to device CPU, is_swa = 0
load_tensors: layer 6 assigned to device CPU, is_swa = 0
load_tensors: layer 7 assigned to device CPU, is_swa = 0
load_tensors: layer 8 assigned to device CPU, is_swa = 0
load_tensors: layer 9 assigned to device CPU, is_swa = 0
load_tensors: layer 10 assigned to device CPU, is_swa = 0
load_tensors: layer 11 assigned to device CPU, is_swa = 0
load_tensors: layer 12 assigned to device CPU, is_swa = 0
load_tensors: layer 13 assigned to device CPU, is_swa = 0
load_tensors: layer 14 assigned to device CPU, is_swa = 0
load_tensors: layer 15 assigned to device CPU, is_swa = 0
load_tensors: layer 16 assigned to device CPU, is_swa = 0
load_tensors: layer 17 assigned to device CPU, is_swa = 0
load_tensors: layer 18 assigned to device CPU, is_swa = 0
load_tensors: layer 19 assigned to device CPU, is_swa = 0
load_tensors: layer 20 assigned to device CPU, is_swa = 0
load_tensors: layer 21 assigned to device CPU, is_swa = 0
load_tensors: layer 22 assigned to device CPU, is_swa = 0
load_tensors: layer 23 assigned to device CPU, is_swa = 0
load_tensors: layer 24 assigned to device CPU, is_swa = 0
load_tensors: layer 25 assigned to device CPU, is_swa = 0
load_tensors: layer 26 assigned to device CPU, is_swa = 0
load_tensors: layer 27 assigned to device CPU, is_swa = 0
load_tensors: layer 28 assigned to device CPU, is_swa = 0
load_tensors: layer 29 assigned to device CPU, is_swa = 0
load_tensors: layer 30 assigned to device CPU, is_swa = 0
load_tensors: layer 31 assigned to device CPU, is_swa = 0
load_tensors: layer 32 assigned to device CPU, is_swa = 0
load_tensors: layer 33 assigned to device CPU, is_swa = 0
load_tensors: layer 34 assigned to device CPU, is_swa = 0
load_tensors: layer 35 assigned to device CPU, is_swa = 0
load_tensors: layer 36 assigned to device CPU, is_swa = 0
load_tensors: layer 37 assigned to device CPU, is_swa = 0
load_tensors: layer 38 assigned to device CPU, is_swa = 0
load_tensors: layer 39 assigned to device CPU, is_swa = 0
load_tensors: layer 40 assigned to device CPU, is_swa = 0
load_tensors: layer 41 assigned to device CPU, is_swa = 0
load_tensors: layer 42 assigned to device CPU, is_swa = 0
load_tensors: layer 43 assigned to device CPU, is_swa = 0
load_tensors: layer 44 assigned to device CPU, is_swa = 0
load_tensors: layer 45 assigned to device CPU, is_swa = 0
load_tensors: layer 46 assigned to device CPU, is_swa = 0
load_tensors: layer 47 assigned to device CPU, is_swa = 0
load_tensors: layer 48 assigned to device CPU, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
create_tensor: loading tensor blk.24.ffn_down_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
create_tensor: loading tensor blk.25.ffn_down_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
create_tensor: loading tensor blk.26.ffn_down_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
create_tensor: loading tensor blk.27.ffn_down_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
create_tensor: loading tensor blk.28.ffn_down_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
create_tensor: loading tensor blk.29.ffn_down_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
create_tensor: loading tensor blk.30.ffn_down_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
create_tensor: loading tensor blk.31.ffn_down_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
create_tensor: loading tensor blk.32.ffn_down_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
create_tensor: loading tensor blk.33.ffn_down_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
create_tensor: loading tensor blk.34.ffn_down_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
create_tensor: loading tensor blk.35.ffn_down_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
create_tensor: loading tensor blk.36.ffn_down_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
create_tensor: loading tensor blk.37.ffn_down_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
create_tensor: loading tensor blk.38.ffn_down_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
create_tensor: loading tensor blk.39.ffn_down_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
create_tensor: loading tensor blk.40.ffn_down_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
create_tensor: loading tensor blk.41.ffn_down_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
create_tensor: loading tensor blk.42.ffn_down_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
create_tensor: loading tensor blk.43.ffn_down_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
create_tensor: loading tensor blk.44.ffn_down_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
create_tensor: loading tensor blk.45.ffn_down_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
create_tensor: loading tensor blk.46.ffn_down_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
create_tensor: loading tensor blk.47.ffn_down_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 0 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading 0 repeating layers to GPU
load_tensors: offloaded 0/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: CPU output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = CPU
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = CPU
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = CPU
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = CPU
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = CPU
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = CPU
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = CPU
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = CPU
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = CPU
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = CPU
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = CPU
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = CPU
llama_kv_cache: CPU KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = CPU
llama_memory_recurrent, layer 1: dev = CPU
llama_memory_recurrent, layer 2: dev = CPU
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = CPU
llama_memory_recurrent, layer 5: dev = CPU
llama_memory_recurrent, layer 6: dev = CPU
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = CPU
llama_memory_recurrent, layer 9: dev = CPU
llama_memory_recurrent, layer 10: dev = CPU
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = CPU
llama_memory_recurrent, layer 13: dev = CPU
llama_memory_recurrent, layer 14: dev = CPU
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = CPU
llama_memory_recurrent, layer 17: dev = CPU
llama_memory_recurrent, layer 18: dev = CPU
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = CPU
llama_memory_recurrent, layer 21: dev = CPU
llama_memory_recurrent, layer 22: dev = CPU
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = CPU
llama_memory_recurrent, layer 25: dev = CPU
llama_memory_recurrent, layer 26: dev = CPU
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = CPU
llama_memory_recurrent, layer 29: dev = CPU
llama_memory_recurrent, layer 30: dev = CPU
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = CPU
llama_memory_recurrent, layer 33: dev = CPU
llama_memory_recurrent, layer 34: dev = CPU
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = CPU
llama_memory_recurrent, layer 37: dev = CPU
llama_memory_recurrent, layer 38: dev = CPU
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = CPU
llama_memory_recurrent, layer 41: dev = CPU
llama_memory_recurrent, layer 42: dev = CPU
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = CPU
llama_memory_recurrent, layer 45: dev = CPU
llama_memory_recurrent, layer 46: dev = CPU
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: CPU RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 1099.00 MiB
sched_reserve: ROCm_Host compute buffer size = 276.11 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 976 (with bs=512), 73 (with bs=1)
sched_reserve: reserve took 6.97 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32586 + ( 1099 = 0 + 0 + 1099) + 17592186043483 |
llama_memory_breakdown_print: | - Host | 57632 = 54208 + 3147 + 276 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer= 0, n_part= 0, overflow_type=4, mem= 1099 MiB
llama_params_fit_impl: filling dense-only layers back-to-front:
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
tensor blk.1.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_down_exps.weight
tensor blk.1.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
tensor blk.1.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
tensor blk.2.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_down_exps.weight
tensor blk.2.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
tensor blk.2.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
tensor blk.3.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_down_exps.weight
tensor blk.3.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
tensor blk.3.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
tensor blk.4.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_down_exps.weight
tensor blk.4.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
tensor blk.4.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
tensor blk.5.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_down_exps.weight
tensor blk.5.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
tensor blk.5.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
tensor blk.6.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_down_exps.weight
tensor blk.6.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
tensor blk.6.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
tensor blk.7.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_down_exps.weight
tensor blk.7.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
tensor blk.7.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
tensor blk.8.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_down_exps.weight
tensor blk.8.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
tensor blk.8.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
tensor blk.9.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_down_exps.weight
tensor blk.9.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
tensor blk.9.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
tensor blk.10.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_down_exps.weight
tensor blk.10.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
tensor blk.10.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
tensor blk.11.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_down_exps.weight
tensor blk.11.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
tensor blk.11.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
tensor blk.12.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_down_exps.weight
tensor blk.12.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
tensor blk.12.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
tensor blk.13.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_down_exps.weight
tensor blk.13.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
tensor blk.13.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
tensor blk.14.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_down_exps.weight
tensor blk.14.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
tensor blk.14.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
tensor blk.15.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_down_exps.weight
tensor blk.15.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
tensor blk.15.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
tensor blk.16.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_down_exps.weight
tensor blk.16.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
tensor blk.16.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
tensor blk.17.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_down_exps.weight
tensor blk.17.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
tensor blk.17.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
tensor blk.18.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_down_exps.weight
tensor blk.18.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
tensor blk.18.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
tensor blk.19.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_down_exps.weight
tensor blk.19.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
tensor blk.19.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
tensor blk.20.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_down_exps.weight
tensor blk.20.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
tensor blk.20.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
tensor blk.21.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_down_exps.weight
tensor blk.21.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
tensor blk.21.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
tensor blk.22.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_down_exps.weight
tensor blk.22.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
tensor blk.22.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
tensor blk.23.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
tensor blk.23.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 141 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 736.00 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 143 (with bs=512), 96 (with bs=1)
sched_reserve: reserve took 5.99 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + ( 6692 = 2808 + 3147 + 736) + 17592186037965 |
llama_memory_breakdown_print: | - Host | 51664 = 51400 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part=48, overflow_type=4, mem= 6692 MiB
llama_params_fit_impl: set ngl_per_device[0].n_layer=49
llama_params_fit_impl: - ROCm0 (AMD Instinct MI100): 49 layers, 6692 MiB used, 25817 MiB free
llama_params_fit_impl: converting dense-only layers to full layers and filling them front-to-back with overflow to next device/system memory:
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
create_tensor: loading tensor blk.24.ffn_down_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
create_tensor: loading tensor blk.25.ffn_down_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
create_tensor: loading tensor blk.26.ffn_down_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
create_tensor: loading tensor blk.27.ffn_down_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
create_tensor: loading tensor blk.28.ffn_down_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
create_tensor: loading tensor blk.29.ffn_down_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
create_tensor: loading tensor blk.30.ffn_down_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
create_tensor: loading tensor blk.31.ffn_down_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
create_tensor: loading tensor blk.32.ffn_down_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
create_tensor: loading tensor blk.33.ffn_down_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
create_tensor: loading tensor blk.34.ffn_down_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
create_tensor: loading tensor blk.35.ffn_down_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
create_tensor: loading tensor blk.36.ffn_down_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
create_tensor: loading tensor blk.37.ffn_down_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
create_tensor: loading tensor blk.38.ffn_down_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
create_tensor: loading tensor blk.39.ffn_down_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
create_tensor: loading tensor blk.40.ffn_down_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
create_tensor: loading tensor blk.41.ffn_down_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
create_tensor: loading tensor blk.42.ffn_down_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
create_tensor: loading tensor blk.43.ffn_down_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
create_tensor: loading tensor blk.44.ffn_down_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
create_tensor: loading tensor blk.45.ffn_down_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
create_tensor: loading tensor blk.46.ffn_down_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
create_tensor: loading tensor blk.47.ffn_down_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 0 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 420.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 2
sched_reserve: reserve took 5.72 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (57572 = 54004 + 3147 + 420) + 17592185987085 |
llama_memory_breakdown_print: | - Host | 468 = 204 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part= 0, overflow_type=4, mem= 57572 MiB
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 72 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 840.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 74 (with bs=512), 50 (with bs=1)
sched_reserve: reserve took 5.86 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (31832 = 27844 + 3147 + 840) + 17592186012825 |
llama_memory_breakdown_print: | - Host | 26628 = 26364 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part=25, overflow_type=4, mem= 31832 MiB
llama_params_fit_impl: set ngl_per_device_high[0].(n_layer, n_part)=(49, 25), id_dense_start_high=0
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
tensor blk.23.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
tensor blk.23.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 75 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 840.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 77 (with bs=512), 52 (with bs=1)
sched_reserve: reserve took 6.19 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (30708 = 26720 + 3147 + 840) + 17592186013949 |
llama_memory_breakdown_print: | - Host | 27752 = 27488 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part=26, overflow_type=4, mem= 30708 MiB
llama_params_fit_impl: set ngl_per_device[0].(n_layer, n_part)=(49, 26), id_dense_start=0
llama_params_fit_impl: trying to fit one extra layer with overflow_type=LAYER_FRACTION_UP
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
tensor blk.23.ffn_gate_inp.weight (4 MiB f32) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
tensor blk.23.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
tensor blk.23.ffn_gate_inp_shexp.weight (0 MiB f32) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
tensor blk.23.ffn_gate_shexp.weight (0 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
tensor blk.23.ffn_down_shexp.weight (0 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 78 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 840.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 80 (with bs=512), 56 (with bs=1)
sched_reserve: reserve took 5.88 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (31054 = 27066 + 3147 + 840) + 17592186013603 |
llama_memory_breakdown_print: | - Host | 27405 = 27141 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part=26, overflow_type=2, mem= 31054 MiB
llama_params_fit_impl: set ngl_per_device[0].(n_layer, n_part, overflow_type)=(49, 26, UP), id_dense_start=0
llama_params_fit_impl: trying to fit one extra layer with overflow_type=LAYER_FRACTION_GATE
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 1
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layer 0 assigned to device ROCm0, is_swa = 0
load_tensors: layer 1 assigned to device ROCm0, is_swa = 0
load_tensors: layer 2 assigned to device ROCm0, is_swa = 0
load_tensors: layer 3 assigned to device ROCm0, is_swa = 0
load_tensors: layer 4 assigned to device ROCm0, is_swa = 0
load_tensors: layer 5 assigned to device ROCm0, is_swa = 0
load_tensors: layer 6 assigned to device ROCm0, is_swa = 0
load_tensors: layer 7 assigned to device ROCm0, is_swa = 0
load_tensors: layer 8 assigned to device ROCm0, is_swa = 0
load_tensors: layer 9 assigned to device ROCm0, is_swa = 0
load_tensors: layer 10 assigned to device ROCm0, is_swa = 0
load_tensors: layer 11 assigned to device ROCm0, is_swa = 0
load_tensors: layer 12 assigned to device ROCm0, is_swa = 0
load_tensors: layer 13 assigned to device ROCm0, is_swa = 0
load_tensors: layer 14 assigned to device ROCm0, is_swa = 0
load_tensors: layer 15 assigned to device ROCm0, is_swa = 0
load_tensors: layer 16 assigned to device ROCm0, is_swa = 0
load_tensors: layer 17 assigned to device ROCm0, is_swa = 0
load_tensors: layer 18 assigned to device ROCm0, is_swa = 0
load_tensors: layer 19 assigned to device ROCm0, is_swa = 0
load_tensors: layer 20 assigned to device ROCm0, is_swa = 0
load_tensors: layer 21 assigned to device ROCm0, is_swa = 0
load_tensors: layer 22 assigned to device ROCm0, is_swa = 0
load_tensors: layer 23 assigned to device ROCm0, is_swa = 0
load_tensors: layer 24 assigned to device ROCm0, is_swa = 0
load_tensors: layer 25 assigned to device ROCm0, is_swa = 0
load_tensors: layer 26 assigned to device ROCm0, is_swa = 0
load_tensors: layer 27 assigned to device ROCm0, is_swa = 0
load_tensors: layer 28 assigned to device ROCm0, is_swa = 0
load_tensors: layer 29 assigned to device ROCm0, is_swa = 0
load_tensors: layer 30 assigned to device ROCm0, is_swa = 0
load_tensors: layer 31 assigned to device ROCm0, is_swa = 0
load_tensors: layer 32 assigned to device ROCm0, is_swa = 0
load_tensors: layer 33 assigned to device ROCm0, is_swa = 0
load_tensors: layer 34 assigned to device ROCm0, is_swa = 0
load_tensors: layer 35 assigned to device ROCm0, is_swa = 0
load_tensors: layer 36 assigned to device ROCm0, is_swa = 0
load_tensors: layer 37 assigned to device ROCm0, is_swa = 0
load_tensors: layer 38 assigned to device ROCm0, is_swa = 0
load_tensors: layer 39 assigned to device ROCm0, is_swa = 0
load_tensors: layer 40 assigned to device ROCm0, is_swa = 0
load_tensors: layer 41 assigned to device ROCm0, is_swa = 0
load_tensors: layer 42 assigned to device ROCm0, is_swa = 0
load_tensors: layer 43 assigned to device ROCm0, is_swa = 0
load_tensors: layer 44 assigned to device ROCm0, is_swa = 0
load_tensors: layer 45 assigned to device ROCm0, is_swa = 0
load_tensors: layer 46 assigned to device ROCm0, is_swa = 0
load_tensors: layer 47 assigned to device ROCm0, is_swa = 0
load_tensors: layer 48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
tensor blk.23.ffn_down_shexp.weight (0 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 74 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 0.00 MiB
load_tensors: ROCm0 model buffer size = 0.00 MiB
load_tensors: ROCm_Host model buffer size = 0.00 MiB
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 0.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 840.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 76 (with bs=512), 54 (with bs=1)
sched_reserve: reserve took 5.75 ms, sched copies = 1
llama_memory_breakdown_print: | memory breakdown [MiB] | total free self model context compute unaccounted |
llama_memory_breakdown_print: | - ROCm0 (MI100) | 32752 = 32510 + (31411 = 27423 + 3147 + 840) + 17592186013246 |
llama_memory_breakdown_print: | - Host | 27048 = 26784 + 0 + 264 |
llama_params_fit_impl: memory for test allocation by device:
llama_params_fit_impl: id=0, n_layer=49, n_part=26, overflow_type=3, mem= 31411 MiB
llama_params_fit_impl: set ngl_per_device[0].(n_layer, n_part, overflow_type)=(49, 26, GATE), id_dense_start=0
llama_params_fit_impl: - ROCm0 (AMD Instinct MI100): 49 layers (26 overflowing), 31411 MiB used, 1098 MiB free
llama_params_fit: successfully fit params to free device memory
llama_params_fit: fitting params to free memory took 1.44 seconds
llama_model_load_from_file_impl: using device ROCm0 (AMD Instinct MI100) (0000:03:00.0) - 32586 MiB free
llama_model_loader: additional 2 GGUFs metadata loaded.
llama_model_loader: loaded meta data with 56 key-value pairs and 843 tensors from /home/edwlan/.cache/llama.cpp/unsloth_Qwen3-Coder-Next-GGUF_Q5_K_M_Qwen3-Coder-Next-Q5_K_M-00001-of-00003.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3next
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.sampling.top_k i32 = 40
llama_model_loader: - kv 3: general.sampling.top_p f32 = 0.950000
llama_model_loader: - kv 4: general.sampling.temp f32 = 1.000000
llama_model_loader: - kv 5: general.name str = Qwen3-Coder-Next
llama_model_loader: - kv 6: general.basename str = Qwen3-Coder-Next
llama_model_loader: - kv 7: general.quantized_by str = Unsloth
llama_model_loader: - kv 8: general.size_label str = 512x2.5B
llama_model_loader: - kv 9: general.license str = apache-2.0
llama_model_loader: - kv 10: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 11: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 12: general.base_model.count u32 = 1
llama_model_loader: - kv 13: general.base_model.0.name str = Qwen3 Coder Next
llama_model_loader: - kv 14: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 15: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 16: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 17: qwen3next.block_count u32 = 48
llama_model_loader: - kv 18: qwen3next.context_length u32 = 262144
llama_model_loader: - kv 19: qwen3next.embedding_length u32 = 2048
llama_model_loader: - kv 20: qwen3next.feed_forward_length u32 = 5120
llama_model_loader: - kv 21: qwen3next.attention.head_count u32 = 16
llama_model_loader: - kv 22: qwen3next.attention.head_count_kv u32 = 2
llama_model_loader: - kv 23: qwen3next.rope.freq_base f32 = 5000000.000000
llama_model_loader: - kv 24: qwen3next.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 25: qwen3next.expert_count u32 = 512
llama_model_loader: - kv 26: qwen3next.expert_used_count u32 = 10
llama_model_loader: - kv 27: qwen3next.attention.key_length u32 = 256
llama_model_loader: - kv 28: qwen3next.attention.value_length u32 = 256
llama_model_loader: - kv 29: qwen3next.expert_feed_forward_length u32 = 512
llama_model_loader: - kv 30: qwen3next.expert_shared_feed_forward_length u32 = 512
llama_model_loader: - kv 31: qwen3next.ssm.conv_kernel u32 = 4
llama_model_loader: - kv 32: qwen3next.ssm.state_size u32 = 128
llama_model_loader: - kv 33: qwen3next.ssm.group_count u32 = 16
llama_model_loader: - kv 34: qwen3next.ssm.time_step_rank u32 = 32
llama_model_loader: - kv 35: qwen3next.ssm.inner_size u32 = 4096
llama_model_loader: - kv 36: qwen3next.full_attention_interval u32 = 4
llama_model_loader: - kv 37: qwen3next.rope.dimension_count u32 = 64
llama_model_loader: - kv 38: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 39: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 40: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 41: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 42: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 43: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 44: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 45: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 46: tokenizer.chat_template str = {% macro render_extra_keys(json_dict,...
llama_model_loader: - kv 47: general.quantization_version u32 = 2
llama_model_loader: - kv 48: general.file_type u32 = 17
llama_model_loader: - kv 49: quantize.imatrix.file str = Qwen3-Coder-Next-GGUF/imatrix_unsloth...
llama_model_loader: - kv 50: quantize.imatrix.dataset str = unsloth_calibration_Qwen3-Coder-Next.txt
llama_model_loader: - kv 51: quantize.imatrix.entries_count u32 = 576
llama_model_loader: - kv 52: quantize.imatrix.chunks_count u32 = 154
llama_model_loader: - kv 53: split.no u16 = 0
llama_model_loader: - kv 54: split.tensors.count i32 = 843
llama_model_loader: - kv 55: split.count u16 = 3
llama_model_loader: - type f32: 361 tensors
llama_model_loader: - type q5_K: 233 tensors
llama_model_loader: - type q6_K: 249 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q5_K - Medium
print_info: file size = 52.94 GiB (5.71 BPW)
init_tokenizer: initializing tokenizer for type 2
load: 0 unused tokens
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control-looking token: 128247 '</s>' was not control-type; this is probably a bug in the model. its type will be overridden
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: printing all EOG tokens:
load: - 128247 ('</s>')
load: - 151643 ('<|endoftext|>')
load: - 151645 ('<|im_end|>')
load: - 151662 ('<|fim_pad|>')
load: - 151663 ('<|repo_name|>')
load: - 151664 ('<|file_sep|>')
load: special tokens cache size = 27
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3next
print_info: vocab_only = 0
print_info: no_alloc = 0
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_embd_inp = 2048
print_info: n_layer = 48
print_info: n_head = 16
print_info: n_head_kv = 2
print_info: n_rot = 64
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 256
print_info: n_embd_head_v = 256
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5120
print_info: n_expert = 512
print_info: n_expert_used = 10
print_info: n_expert_groups = 0
print_info: n_group_used = 0
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 5000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_yarn_log_mul = 0.0000
print_info: rope_finetuned = unknown
print_info: ssm_d_conv = 4
print_info: ssm_d_inner = 4096
print_info: ssm_d_state = 128
print_info: ssm_dt_rank = 32
print_info: ssm_n_group = 16
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 80B.A3B
print_info: model params = 79.67 B
print_info: general.name = Qwen3-Coder-Next
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 128247 '</s>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false, direct_io = false)
load_tensors: layers 0-48 assigned to device ROCm0, is_swa = 0
create_tensor: loading tensor token_embd.weight
create_tensor: loading tensor output_norm.weight
create_tensor: loading tensor output.weight
create_tensor: loading tensor blk.0.attn_norm.weight
create_tensor: loading tensor blk.0.post_attention_norm.weight
create_tensor: loading tensor blk.0.attn_qkv.weight
create_tensor: loading tensor blk.0.attn_gate.weight
create_tensor: loading tensor blk.0.ssm_conv1d.weight
create_tensor: loading tensor blk.0.ssm_dt.bias
create_tensor: loading tensor blk.0.ssm_a
create_tensor: loading tensor blk.0.ssm_ba.weight
create_tensor: loading tensor blk.0.ssm_norm.weight
create_tensor: loading tensor blk.0.ssm_out.weight
create_tensor: loading tensor blk.0.ffn_gate_inp.weight
create_tensor: loading tensor blk.0.ffn_down_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_exps.weight
create_tensor: loading tensor blk.0.ffn_up_exps.weight
create_tensor: loading tensor blk.0.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.0.ffn_gate_shexp.weight
create_tensor: loading tensor blk.0.ffn_up_shexp.weight
create_tensor: loading tensor blk.0.ffn_down_shexp.weight
create_tensor: loading tensor blk.1.attn_norm.weight
create_tensor: loading tensor blk.1.post_attention_norm.weight
create_tensor: loading tensor blk.1.attn_qkv.weight
create_tensor: loading tensor blk.1.attn_gate.weight
create_tensor: loading tensor blk.1.ssm_conv1d.weight
create_tensor: loading tensor blk.1.ssm_dt.bias
create_tensor: loading tensor blk.1.ssm_a
create_tensor: loading tensor blk.1.ssm_ba.weight
create_tensor: loading tensor blk.1.ssm_norm.weight
create_tensor: loading tensor blk.1.ssm_out.weight
create_tensor: loading tensor blk.1.ffn_gate_inp.weight
create_tensor: loading tensor blk.1.ffn_down_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_exps.weight
create_tensor: loading tensor blk.1.ffn_up_exps.weight
create_tensor: loading tensor blk.1.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.1.ffn_gate_shexp.weight
create_tensor: loading tensor blk.1.ffn_up_shexp.weight
create_tensor: loading tensor blk.1.ffn_down_shexp.weight
create_tensor: loading tensor blk.2.attn_norm.weight
create_tensor: loading tensor blk.2.post_attention_norm.weight
create_tensor: loading tensor blk.2.attn_qkv.weight
create_tensor: loading tensor blk.2.attn_gate.weight
create_tensor: loading tensor blk.2.ssm_conv1d.weight
create_tensor: loading tensor blk.2.ssm_dt.bias
create_tensor: loading tensor blk.2.ssm_a
create_tensor: loading tensor blk.2.ssm_ba.weight
create_tensor: loading tensor blk.2.ssm_norm.weight
create_tensor: loading tensor blk.2.ssm_out.weight
create_tensor: loading tensor blk.2.ffn_gate_inp.weight
create_tensor: loading tensor blk.2.ffn_down_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_exps.weight
create_tensor: loading tensor blk.2.ffn_up_exps.weight
create_tensor: loading tensor blk.2.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.2.ffn_gate_shexp.weight
create_tensor: loading tensor blk.2.ffn_up_shexp.weight
create_tensor: loading tensor blk.2.ffn_down_shexp.weight
create_tensor: loading tensor blk.3.attn_norm.weight
create_tensor: loading tensor blk.3.post_attention_norm.weight
create_tensor: loading tensor blk.3.attn_q.weight
create_tensor: loading tensor blk.3.attn_k.weight
create_tensor: loading tensor blk.3.attn_v.weight
create_tensor: loading tensor blk.3.attn_output.weight
create_tensor: loading tensor blk.3.attn_q_norm.weight
create_tensor: loading tensor blk.3.attn_k_norm.weight
create_tensor: loading tensor blk.3.ffn_gate_inp.weight
create_tensor: loading tensor blk.3.ffn_down_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_exps.weight
create_tensor: loading tensor blk.3.ffn_up_exps.weight
create_tensor: loading tensor blk.3.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.3.ffn_gate_shexp.weight
create_tensor: loading tensor blk.3.ffn_up_shexp.weight
create_tensor: loading tensor blk.3.ffn_down_shexp.weight
create_tensor: loading tensor blk.4.attn_norm.weight
create_tensor: loading tensor blk.4.post_attention_norm.weight
create_tensor: loading tensor blk.4.attn_qkv.weight
create_tensor: loading tensor blk.4.attn_gate.weight
create_tensor: loading tensor blk.4.ssm_conv1d.weight
create_tensor: loading tensor blk.4.ssm_dt.bias
create_tensor: loading tensor blk.4.ssm_a
create_tensor: loading tensor blk.4.ssm_ba.weight
create_tensor: loading tensor blk.4.ssm_norm.weight
create_tensor: loading tensor blk.4.ssm_out.weight
create_tensor: loading tensor blk.4.ffn_gate_inp.weight
create_tensor: loading tensor blk.4.ffn_down_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_exps.weight
create_tensor: loading tensor blk.4.ffn_up_exps.weight
create_tensor: loading tensor blk.4.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.4.ffn_gate_shexp.weight
create_tensor: loading tensor blk.4.ffn_up_shexp.weight
create_tensor: loading tensor blk.4.ffn_down_shexp.weight
create_tensor: loading tensor blk.5.attn_norm.weight
create_tensor: loading tensor blk.5.post_attention_norm.weight
create_tensor: loading tensor blk.5.attn_qkv.weight
create_tensor: loading tensor blk.5.attn_gate.weight
create_tensor: loading tensor blk.5.ssm_conv1d.weight
create_tensor: loading tensor blk.5.ssm_dt.bias
create_tensor: loading tensor blk.5.ssm_a
create_tensor: loading tensor blk.5.ssm_ba.weight
create_tensor: loading tensor blk.5.ssm_norm.weight
create_tensor: loading tensor blk.5.ssm_out.weight
create_tensor: loading tensor blk.5.ffn_gate_inp.weight
create_tensor: loading tensor blk.5.ffn_down_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_exps.weight
create_tensor: loading tensor blk.5.ffn_up_exps.weight
create_tensor: loading tensor blk.5.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.5.ffn_gate_shexp.weight
create_tensor: loading tensor blk.5.ffn_up_shexp.weight
create_tensor: loading tensor blk.5.ffn_down_shexp.weight
create_tensor: loading tensor blk.6.attn_norm.weight
create_tensor: loading tensor blk.6.post_attention_norm.weight
create_tensor: loading tensor blk.6.attn_qkv.weight
create_tensor: loading tensor blk.6.attn_gate.weight
create_tensor: loading tensor blk.6.ssm_conv1d.weight
create_tensor: loading tensor blk.6.ssm_dt.bias
create_tensor: loading tensor blk.6.ssm_a
create_tensor: loading tensor blk.6.ssm_ba.weight
create_tensor: loading tensor blk.6.ssm_norm.weight
create_tensor: loading tensor blk.6.ssm_out.weight
create_tensor: loading tensor blk.6.ffn_gate_inp.weight
create_tensor: loading tensor blk.6.ffn_down_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_exps.weight
create_tensor: loading tensor blk.6.ffn_up_exps.weight
create_tensor: loading tensor blk.6.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.6.ffn_gate_shexp.weight
create_tensor: loading tensor blk.6.ffn_up_shexp.weight
create_tensor: loading tensor blk.6.ffn_down_shexp.weight
create_tensor: loading tensor blk.7.attn_norm.weight
create_tensor: loading tensor blk.7.post_attention_norm.weight
create_tensor: loading tensor blk.7.attn_q.weight
create_tensor: loading tensor blk.7.attn_k.weight
create_tensor: loading tensor blk.7.attn_v.weight
create_tensor: loading tensor blk.7.attn_output.weight
create_tensor: loading tensor blk.7.attn_q_norm.weight
create_tensor: loading tensor blk.7.attn_k_norm.weight
create_tensor: loading tensor blk.7.ffn_gate_inp.weight
create_tensor: loading tensor blk.7.ffn_down_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_exps.weight
create_tensor: loading tensor blk.7.ffn_up_exps.weight
create_tensor: loading tensor blk.7.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.7.ffn_gate_shexp.weight
create_tensor: loading tensor blk.7.ffn_up_shexp.weight
create_tensor: loading tensor blk.7.ffn_down_shexp.weight
create_tensor: loading tensor blk.8.attn_norm.weight
create_tensor: loading tensor blk.8.post_attention_norm.weight
create_tensor: loading tensor blk.8.attn_qkv.weight
create_tensor: loading tensor blk.8.attn_gate.weight
create_tensor: loading tensor blk.8.ssm_conv1d.weight
create_tensor: loading tensor blk.8.ssm_dt.bias
create_tensor: loading tensor blk.8.ssm_a
create_tensor: loading tensor blk.8.ssm_ba.weight
create_tensor: loading tensor blk.8.ssm_norm.weight
create_tensor: loading tensor blk.8.ssm_out.weight
create_tensor: loading tensor blk.8.ffn_gate_inp.weight
create_tensor: loading tensor blk.8.ffn_down_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_exps.weight
create_tensor: loading tensor blk.8.ffn_up_exps.weight
create_tensor: loading tensor blk.8.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.8.ffn_gate_shexp.weight
create_tensor: loading tensor blk.8.ffn_up_shexp.weight
create_tensor: loading tensor blk.8.ffn_down_shexp.weight
create_tensor: loading tensor blk.9.attn_norm.weight
create_tensor: loading tensor blk.9.post_attention_norm.weight
create_tensor: loading tensor blk.9.attn_qkv.weight
create_tensor: loading tensor blk.9.attn_gate.weight
create_tensor: loading tensor blk.9.ssm_conv1d.weight
create_tensor: loading tensor blk.9.ssm_dt.bias
create_tensor: loading tensor blk.9.ssm_a
create_tensor: loading tensor blk.9.ssm_ba.weight
create_tensor: loading tensor blk.9.ssm_norm.weight
create_tensor: loading tensor blk.9.ssm_out.weight
create_tensor: loading tensor blk.9.ffn_gate_inp.weight
create_tensor: loading tensor blk.9.ffn_down_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_exps.weight
create_tensor: loading tensor blk.9.ffn_up_exps.weight
create_tensor: loading tensor blk.9.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.9.ffn_gate_shexp.weight
create_tensor: loading tensor blk.9.ffn_up_shexp.weight
create_tensor: loading tensor blk.9.ffn_down_shexp.weight
create_tensor: loading tensor blk.10.attn_norm.weight
create_tensor: loading tensor blk.10.post_attention_norm.weight
create_tensor: loading tensor blk.10.attn_qkv.weight
create_tensor: loading tensor blk.10.attn_gate.weight
create_tensor: loading tensor blk.10.ssm_conv1d.weight
create_tensor: loading tensor blk.10.ssm_dt.bias
create_tensor: loading tensor blk.10.ssm_a
create_tensor: loading tensor blk.10.ssm_ba.weight
create_tensor: loading tensor blk.10.ssm_norm.weight
create_tensor: loading tensor blk.10.ssm_out.weight
create_tensor: loading tensor blk.10.ffn_gate_inp.weight
create_tensor: loading tensor blk.10.ffn_down_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_exps.weight
create_tensor: loading tensor blk.10.ffn_up_exps.weight
create_tensor: loading tensor blk.10.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.10.ffn_gate_shexp.weight
create_tensor: loading tensor blk.10.ffn_up_shexp.weight
create_tensor: loading tensor blk.10.ffn_down_shexp.weight
create_tensor: loading tensor blk.11.attn_norm.weight
create_tensor: loading tensor blk.11.post_attention_norm.weight
create_tensor: loading tensor blk.11.attn_q.weight
create_tensor: loading tensor blk.11.attn_k.weight
create_tensor: loading tensor blk.11.attn_v.weight
create_tensor: loading tensor blk.11.attn_output.weight
create_tensor: loading tensor blk.11.attn_q_norm.weight
create_tensor: loading tensor blk.11.attn_k_norm.weight
create_tensor: loading tensor blk.11.ffn_gate_inp.weight
create_tensor: loading tensor blk.11.ffn_down_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_exps.weight
create_tensor: loading tensor blk.11.ffn_up_exps.weight
create_tensor: loading tensor blk.11.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.11.ffn_gate_shexp.weight
create_tensor: loading tensor blk.11.ffn_up_shexp.weight
create_tensor: loading tensor blk.11.ffn_down_shexp.weight
create_tensor: loading tensor blk.12.attn_norm.weight
create_tensor: loading tensor blk.12.post_attention_norm.weight
create_tensor: loading tensor blk.12.attn_qkv.weight
create_tensor: loading tensor blk.12.attn_gate.weight
create_tensor: loading tensor blk.12.ssm_conv1d.weight
create_tensor: loading tensor blk.12.ssm_dt.bias
create_tensor: loading tensor blk.12.ssm_a
create_tensor: loading tensor blk.12.ssm_ba.weight
create_tensor: loading tensor blk.12.ssm_norm.weight
create_tensor: loading tensor blk.12.ssm_out.weight
create_tensor: loading tensor blk.12.ffn_gate_inp.weight
create_tensor: loading tensor blk.12.ffn_down_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_exps.weight
create_tensor: loading tensor blk.12.ffn_up_exps.weight
create_tensor: loading tensor blk.12.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.12.ffn_gate_shexp.weight
create_tensor: loading tensor blk.12.ffn_up_shexp.weight
create_tensor: loading tensor blk.12.ffn_down_shexp.weight
create_tensor: loading tensor blk.13.attn_norm.weight
create_tensor: loading tensor blk.13.post_attention_norm.weight
create_tensor: loading tensor blk.13.attn_qkv.weight
create_tensor: loading tensor blk.13.attn_gate.weight
create_tensor: loading tensor blk.13.ssm_conv1d.weight
create_tensor: loading tensor blk.13.ssm_dt.bias
create_tensor: loading tensor blk.13.ssm_a
create_tensor: loading tensor blk.13.ssm_ba.weight
create_tensor: loading tensor blk.13.ssm_norm.weight
create_tensor: loading tensor blk.13.ssm_out.weight
create_tensor: loading tensor blk.13.ffn_gate_inp.weight
create_tensor: loading tensor blk.13.ffn_down_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_exps.weight
create_tensor: loading tensor blk.13.ffn_up_exps.weight
create_tensor: loading tensor blk.13.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.13.ffn_gate_shexp.weight
create_tensor: loading tensor blk.13.ffn_up_shexp.weight
create_tensor: loading tensor blk.13.ffn_down_shexp.weight
create_tensor: loading tensor blk.14.attn_norm.weight
create_tensor: loading tensor blk.14.post_attention_norm.weight
create_tensor: loading tensor blk.14.attn_qkv.weight
create_tensor: loading tensor blk.14.attn_gate.weight
create_tensor: loading tensor blk.14.ssm_conv1d.weight
create_tensor: loading tensor blk.14.ssm_dt.bias
create_tensor: loading tensor blk.14.ssm_a
create_tensor: loading tensor blk.14.ssm_ba.weight
create_tensor: loading tensor blk.14.ssm_norm.weight
create_tensor: loading tensor blk.14.ssm_out.weight
create_tensor: loading tensor blk.14.ffn_gate_inp.weight
create_tensor: loading tensor blk.14.ffn_down_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_exps.weight
create_tensor: loading tensor blk.14.ffn_up_exps.weight
create_tensor: loading tensor blk.14.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.14.ffn_gate_shexp.weight
create_tensor: loading tensor blk.14.ffn_up_shexp.weight
create_tensor: loading tensor blk.14.ffn_down_shexp.weight
create_tensor: loading tensor blk.15.attn_norm.weight
create_tensor: loading tensor blk.15.post_attention_norm.weight
create_tensor: loading tensor blk.15.attn_q.weight
create_tensor: loading tensor blk.15.attn_k.weight
create_tensor: loading tensor blk.15.attn_v.weight
create_tensor: loading tensor blk.15.attn_output.weight
create_tensor: loading tensor blk.15.attn_q_norm.weight
create_tensor: loading tensor blk.15.attn_k_norm.weight
create_tensor: loading tensor blk.15.ffn_gate_inp.weight
create_tensor: loading tensor blk.15.ffn_down_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_exps.weight
create_tensor: loading tensor blk.15.ffn_up_exps.weight
create_tensor: loading tensor blk.15.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.15.ffn_gate_shexp.weight
create_tensor: loading tensor blk.15.ffn_up_shexp.weight
create_tensor: loading tensor blk.15.ffn_down_shexp.weight
create_tensor: loading tensor blk.16.attn_norm.weight
create_tensor: loading tensor blk.16.post_attention_norm.weight
create_tensor: loading tensor blk.16.attn_qkv.weight
create_tensor: loading tensor blk.16.attn_gate.weight
create_tensor: loading tensor blk.16.ssm_conv1d.weight
create_tensor: loading tensor blk.16.ssm_dt.bias
create_tensor: loading tensor blk.16.ssm_a
create_tensor: loading tensor blk.16.ssm_ba.weight
create_tensor: loading tensor blk.16.ssm_norm.weight
create_tensor: loading tensor blk.16.ssm_out.weight
create_tensor: loading tensor blk.16.ffn_gate_inp.weight
create_tensor: loading tensor blk.16.ffn_down_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_exps.weight
create_tensor: loading tensor blk.16.ffn_up_exps.weight
create_tensor: loading tensor blk.16.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.16.ffn_gate_shexp.weight
create_tensor: loading tensor blk.16.ffn_up_shexp.weight
create_tensor: loading tensor blk.16.ffn_down_shexp.weight
create_tensor: loading tensor blk.17.attn_norm.weight
create_tensor: loading tensor blk.17.post_attention_norm.weight
create_tensor: loading tensor blk.17.attn_qkv.weight
create_tensor: loading tensor blk.17.attn_gate.weight
create_tensor: loading tensor blk.17.ssm_conv1d.weight
create_tensor: loading tensor blk.17.ssm_dt.bias
create_tensor: loading tensor blk.17.ssm_a
create_tensor: loading tensor blk.17.ssm_ba.weight
create_tensor: loading tensor blk.17.ssm_norm.weight
create_tensor: loading tensor blk.17.ssm_out.weight
create_tensor: loading tensor blk.17.ffn_gate_inp.weight
create_tensor: loading tensor blk.17.ffn_down_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_exps.weight
create_tensor: loading tensor blk.17.ffn_up_exps.weight
create_tensor: loading tensor blk.17.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.17.ffn_gate_shexp.weight
create_tensor: loading tensor blk.17.ffn_up_shexp.weight
create_tensor: loading tensor blk.17.ffn_down_shexp.weight
create_tensor: loading tensor blk.18.attn_norm.weight
create_tensor: loading tensor blk.18.post_attention_norm.weight
create_tensor: loading tensor blk.18.attn_qkv.weight
create_tensor: loading tensor blk.18.attn_gate.weight
create_tensor: loading tensor blk.18.ssm_conv1d.weight
create_tensor: loading tensor blk.18.ssm_dt.bias
create_tensor: loading tensor blk.18.ssm_a
create_tensor: loading tensor blk.18.ssm_ba.weight
create_tensor: loading tensor blk.18.ssm_norm.weight
create_tensor: loading tensor blk.18.ssm_out.weight
create_tensor: loading tensor blk.18.ffn_gate_inp.weight
create_tensor: loading tensor blk.18.ffn_down_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_exps.weight
create_tensor: loading tensor blk.18.ffn_up_exps.weight
create_tensor: loading tensor blk.18.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.18.ffn_gate_shexp.weight
create_tensor: loading tensor blk.18.ffn_up_shexp.weight
create_tensor: loading tensor blk.18.ffn_down_shexp.weight
create_tensor: loading tensor blk.19.attn_norm.weight
create_tensor: loading tensor blk.19.post_attention_norm.weight
create_tensor: loading tensor blk.19.attn_q.weight
create_tensor: loading tensor blk.19.attn_k.weight
create_tensor: loading tensor blk.19.attn_v.weight
create_tensor: loading tensor blk.19.attn_output.weight
create_tensor: loading tensor blk.19.attn_q_norm.weight
create_tensor: loading tensor blk.19.attn_k_norm.weight
create_tensor: loading tensor blk.19.ffn_gate_inp.weight
create_tensor: loading tensor blk.19.ffn_down_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_exps.weight
create_tensor: loading tensor blk.19.ffn_up_exps.weight
create_tensor: loading tensor blk.19.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.19.ffn_gate_shexp.weight
create_tensor: loading tensor blk.19.ffn_up_shexp.weight
create_tensor: loading tensor blk.19.ffn_down_shexp.weight
create_tensor: loading tensor blk.20.attn_norm.weight
create_tensor: loading tensor blk.20.post_attention_norm.weight
create_tensor: loading tensor blk.20.attn_qkv.weight
create_tensor: loading tensor blk.20.attn_gate.weight
create_tensor: loading tensor blk.20.ssm_conv1d.weight
create_tensor: loading tensor blk.20.ssm_dt.bias
create_tensor: loading tensor blk.20.ssm_a
create_tensor: loading tensor blk.20.ssm_ba.weight
create_tensor: loading tensor blk.20.ssm_norm.weight
create_tensor: loading tensor blk.20.ssm_out.weight
create_tensor: loading tensor blk.20.ffn_gate_inp.weight
create_tensor: loading tensor blk.20.ffn_down_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_exps.weight
create_tensor: loading tensor blk.20.ffn_up_exps.weight
create_tensor: loading tensor blk.20.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.20.ffn_gate_shexp.weight
create_tensor: loading tensor blk.20.ffn_up_shexp.weight
create_tensor: loading tensor blk.20.ffn_down_shexp.weight
create_tensor: loading tensor blk.21.attn_norm.weight
create_tensor: loading tensor blk.21.post_attention_norm.weight
create_tensor: loading tensor blk.21.attn_qkv.weight
create_tensor: loading tensor blk.21.attn_gate.weight
create_tensor: loading tensor blk.21.ssm_conv1d.weight
create_tensor: loading tensor blk.21.ssm_dt.bias
create_tensor: loading tensor blk.21.ssm_a
create_tensor: loading tensor blk.21.ssm_ba.weight
create_tensor: loading tensor blk.21.ssm_norm.weight
create_tensor: loading tensor blk.21.ssm_out.weight
create_tensor: loading tensor blk.21.ffn_gate_inp.weight
create_tensor: loading tensor blk.21.ffn_down_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_exps.weight
create_tensor: loading tensor blk.21.ffn_up_exps.weight
create_tensor: loading tensor blk.21.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.21.ffn_gate_shexp.weight
create_tensor: loading tensor blk.21.ffn_up_shexp.weight
create_tensor: loading tensor blk.21.ffn_down_shexp.weight
create_tensor: loading tensor blk.22.attn_norm.weight
create_tensor: loading tensor blk.22.post_attention_norm.weight
create_tensor: loading tensor blk.22.attn_qkv.weight
create_tensor: loading tensor blk.22.attn_gate.weight
create_tensor: loading tensor blk.22.ssm_conv1d.weight
create_tensor: loading tensor blk.22.ssm_dt.bias
create_tensor: loading tensor blk.22.ssm_a
create_tensor: loading tensor blk.22.ssm_ba.weight
create_tensor: loading tensor blk.22.ssm_norm.weight
create_tensor: loading tensor blk.22.ssm_out.weight
create_tensor: loading tensor blk.22.ffn_gate_inp.weight
create_tensor: loading tensor blk.22.ffn_down_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_exps.weight
create_tensor: loading tensor blk.22.ffn_up_exps.weight
create_tensor: loading tensor blk.22.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.22.ffn_gate_shexp.weight
create_tensor: loading tensor blk.22.ffn_up_shexp.weight
create_tensor: loading tensor blk.22.ffn_down_shexp.weight
create_tensor: loading tensor blk.23.attn_norm.weight
create_tensor: loading tensor blk.23.post_attention_norm.weight
create_tensor: loading tensor blk.23.attn_q.weight
create_tensor: loading tensor blk.23.attn_k.weight
create_tensor: loading tensor blk.23.attn_v.weight
create_tensor: loading tensor blk.23.attn_output.weight
create_tensor: loading tensor blk.23.attn_q_norm.weight
create_tensor: loading tensor blk.23.attn_k_norm.weight
create_tensor: loading tensor blk.23.ffn_gate_inp.weight
tensor blk.23.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_exps.weight
create_tensor: loading tensor blk.23.ffn_up_exps.weight
create_tensor: loading tensor blk.23.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.23.ffn_gate_shexp.weight
create_tensor: loading tensor blk.23.ffn_up_shexp.weight
tensor blk.23.ffn_down_shexp.weight (0 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.23.ffn_down_shexp.weight
create_tensor: loading tensor blk.24.attn_norm.weight
create_tensor: loading tensor blk.24.post_attention_norm.weight
create_tensor: loading tensor blk.24.attn_qkv.weight
create_tensor: loading tensor blk.24.attn_gate.weight
create_tensor: loading tensor blk.24.ssm_conv1d.weight
create_tensor: loading tensor blk.24.ssm_dt.bias
create_tensor: loading tensor blk.24.ssm_a
create_tensor: loading tensor blk.24.ssm_ba.weight
create_tensor: loading tensor blk.24.ssm_norm.weight
create_tensor: loading tensor blk.24.ssm_out.weight
create_tensor: loading tensor blk.24.ffn_gate_inp.weight
tensor blk.24.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_down_exps.weight
tensor blk.24.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_gate_exps.weight
tensor blk.24.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.24.ffn_up_exps.weight
create_tensor: loading tensor blk.24.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.24.ffn_gate_shexp.weight
create_tensor: loading tensor blk.24.ffn_up_shexp.weight
create_tensor: loading tensor blk.24.ffn_down_shexp.weight
create_tensor: loading tensor blk.25.attn_norm.weight
create_tensor: loading tensor blk.25.post_attention_norm.weight
create_tensor: loading tensor blk.25.attn_qkv.weight
create_tensor: loading tensor blk.25.attn_gate.weight
create_tensor: loading tensor blk.25.ssm_conv1d.weight
create_tensor: loading tensor blk.25.ssm_dt.bias
create_tensor: loading tensor blk.25.ssm_a
create_tensor: loading tensor blk.25.ssm_ba.weight
create_tensor: loading tensor blk.25.ssm_norm.weight
create_tensor: loading tensor blk.25.ssm_out.weight
create_tensor: loading tensor blk.25.ffn_gate_inp.weight
tensor blk.25.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_down_exps.weight
tensor blk.25.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_gate_exps.weight
tensor blk.25.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.25.ffn_up_exps.weight
create_tensor: loading tensor blk.25.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.25.ffn_gate_shexp.weight
create_tensor: loading tensor blk.25.ffn_up_shexp.weight
create_tensor: loading tensor blk.25.ffn_down_shexp.weight
create_tensor: loading tensor blk.26.attn_norm.weight
create_tensor: loading tensor blk.26.post_attention_norm.weight
create_tensor: loading tensor blk.26.attn_qkv.weight
create_tensor: loading tensor blk.26.attn_gate.weight
create_tensor: loading tensor blk.26.ssm_conv1d.weight
create_tensor: loading tensor blk.26.ssm_dt.bias
create_tensor: loading tensor blk.26.ssm_a
create_tensor: loading tensor blk.26.ssm_ba.weight
create_tensor: loading tensor blk.26.ssm_norm.weight
create_tensor: loading tensor blk.26.ssm_out.weight
create_tensor: loading tensor blk.26.ffn_gate_inp.weight
tensor blk.26.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_down_exps.weight
tensor blk.26.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_gate_exps.weight
tensor blk.26.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.26.ffn_up_exps.weight
create_tensor: loading tensor blk.26.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.26.ffn_gate_shexp.weight
create_tensor: loading tensor blk.26.ffn_up_shexp.weight
create_tensor: loading tensor blk.26.ffn_down_shexp.weight
create_tensor: loading tensor blk.27.attn_norm.weight
create_tensor: loading tensor blk.27.post_attention_norm.weight
create_tensor: loading tensor blk.27.attn_q.weight
create_tensor: loading tensor blk.27.attn_k.weight
create_tensor: loading tensor blk.27.attn_v.weight
create_tensor: loading tensor blk.27.attn_output.weight
create_tensor: loading tensor blk.27.attn_q_norm.weight
create_tensor: loading tensor blk.27.attn_k_norm.weight
create_tensor: loading tensor blk.27.ffn_gate_inp.weight
tensor blk.27.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_down_exps.weight
tensor blk.27.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_gate_exps.weight
tensor blk.27.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.27.ffn_up_exps.weight
create_tensor: loading tensor blk.27.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.27.ffn_gate_shexp.weight
create_tensor: loading tensor blk.27.ffn_up_shexp.weight
create_tensor: loading tensor blk.27.ffn_down_shexp.weight
create_tensor: loading tensor blk.28.attn_norm.weight
create_tensor: loading tensor blk.28.post_attention_norm.weight
create_tensor: loading tensor blk.28.attn_qkv.weight
create_tensor: loading tensor blk.28.attn_gate.weight
create_tensor: loading tensor blk.28.ssm_conv1d.weight
create_tensor: loading tensor blk.28.ssm_dt.bias
create_tensor: loading tensor blk.28.ssm_a
create_tensor: loading tensor blk.28.ssm_ba.weight
create_tensor: loading tensor blk.28.ssm_norm.weight
create_tensor: loading tensor blk.28.ssm_out.weight
create_tensor: loading tensor blk.28.ffn_gate_inp.weight
tensor blk.28.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_down_exps.weight
tensor blk.28.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_gate_exps.weight
tensor blk.28.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.28.ffn_up_exps.weight
create_tensor: loading tensor blk.28.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.28.ffn_gate_shexp.weight
create_tensor: loading tensor blk.28.ffn_up_shexp.weight
create_tensor: loading tensor blk.28.ffn_down_shexp.weight
create_tensor: loading tensor blk.29.attn_norm.weight
create_tensor: loading tensor blk.29.post_attention_norm.weight
create_tensor: loading tensor blk.29.attn_qkv.weight
create_tensor: loading tensor blk.29.attn_gate.weight
create_tensor: loading tensor blk.29.ssm_conv1d.weight
create_tensor: loading tensor blk.29.ssm_dt.bias
create_tensor: loading tensor blk.29.ssm_a
create_tensor: loading tensor blk.29.ssm_ba.weight
create_tensor: loading tensor blk.29.ssm_norm.weight
create_tensor: loading tensor blk.29.ssm_out.weight
create_tensor: loading tensor blk.29.ffn_gate_inp.weight
tensor blk.29.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_down_exps.weight
tensor blk.29.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_gate_exps.weight
tensor blk.29.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.29.ffn_up_exps.weight
create_tensor: loading tensor blk.29.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.29.ffn_gate_shexp.weight
create_tensor: loading tensor blk.29.ffn_up_shexp.weight
create_tensor: loading tensor blk.29.ffn_down_shexp.weight
create_tensor: loading tensor blk.30.attn_norm.weight
create_tensor: loading tensor blk.30.post_attention_norm.weight
create_tensor: loading tensor blk.30.attn_qkv.weight
create_tensor: loading tensor blk.30.attn_gate.weight
create_tensor: loading tensor blk.30.ssm_conv1d.weight
create_tensor: loading tensor blk.30.ssm_dt.bias
create_tensor: loading tensor blk.30.ssm_a
create_tensor: loading tensor blk.30.ssm_ba.weight
create_tensor: loading tensor blk.30.ssm_norm.weight
create_tensor: loading tensor blk.30.ssm_out.weight
create_tensor: loading tensor blk.30.ffn_gate_inp.weight
tensor blk.30.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_down_exps.weight
tensor blk.30.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_gate_exps.weight
tensor blk.30.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.30.ffn_up_exps.weight
create_tensor: loading tensor blk.30.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.30.ffn_gate_shexp.weight
create_tensor: loading tensor blk.30.ffn_up_shexp.weight
create_tensor: loading tensor blk.30.ffn_down_shexp.weight
create_tensor: loading tensor blk.31.attn_norm.weight
create_tensor: loading tensor blk.31.post_attention_norm.weight
create_tensor: loading tensor blk.31.attn_q.weight
create_tensor: loading tensor blk.31.attn_k.weight
create_tensor: loading tensor blk.31.attn_v.weight
create_tensor: loading tensor blk.31.attn_output.weight
create_tensor: loading tensor blk.31.attn_q_norm.weight
create_tensor: loading tensor blk.31.attn_k_norm.weight
create_tensor: loading tensor blk.31.ffn_gate_inp.weight
tensor blk.31.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_down_exps.weight
tensor blk.31.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_gate_exps.weight
tensor blk.31.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.31.ffn_up_exps.weight
create_tensor: loading tensor blk.31.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.31.ffn_gate_shexp.weight
create_tensor: loading tensor blk.31.ffn_up_shexp.weight
create_tensor: loading tensor blk.31.ffn_down_shexp.weight
create_tensor: loading tensor blk.32.attn_norm.weight
create_tensor: loading tensor blk.32.post_attention_norm.weight
create_tensor: loading tensor blk.32.attn_qkv.weight
create_tensor: loading tensor blk.32.attn_gate.weight
create_tensor: loading tensor blk.32.ssm_conv1d.weight
create_tensor: loading tensor blk.32.ssm_dt.bias
create_tensor: loading tensor blk.32.ssm_a
create_tensor: loading tensor blk.32.ssm_ba.weight
create_tensor: loading tensor blk.32.ssm_norm.weight
create_tensor: loading tensor blk.32.ssm_out.weight
create_tensor: loading tensor blk.32.ffn_gate_inp.weight
tensor blk.32.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_down_exps.weight
tensor blk.32.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_gate_exps.weight
tensor blk.32.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.32.ffn_up_exps.weight
create_tensor: loading tensor blk.32.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.32.ffn_gate_shexp.weight
create_tensor: loading tensor blk.32.ffn_up_shexp.weight
create_tensor: loading tensor blk.32.ffn_down_shexp.weight
create_tensor: loading tensor blk.33.attn_norm.weight
create_tensor: loading tensor blk.33.post_attention_norm.weight
create_tensor: loading tensor blk.33.attn_qkv.weight
create_tensor: loading tensor blk.33.attn_gate.weight
create_tensor: loading tensor blk.33.ssm_conv1d.weight
create_tensor: loading tensor blk.33.ssm_dt.bias
create_tensor: loading tensor blk.33.ssm_a
create_tensor: loading tensor blk.33.ssm_ba.weight
create_tensor: loading tensor blk.33.ssm_norm.weight
create_tensor: loading tensor blk.33.ssm_out.weight
create_tensor: loading tensor blk.33.ffn_gate_inp.weight
tensor blk.33.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_down_exps.weight
tensor blk.33.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_gate_exps.weight
tensor blk.33.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.33.ffn_up_exps.weight
create_tensor: loading tensor blk.33.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.33.ffn_gate_shexp.weight
create_tensor: loading tensor blk.33.ffn_up_shexp.weight
create_tensor: loading tensor blk.33.ffn_down_shexp.weight
create_tensor: loading tensor blk.34.attn_norm.weight
create_tensor: loading tensor blk.34.post_attention_norm.weight
create_tensor: loading tensor blk.34.attn_qkv.weight
create_tensor: loading tensor blk.34.attn_gate.weight
create_tensor: loading tensor blk.34.ssm_conv1d.weight
create_tensor: loading tensor blk.34.ssm_dt.bias
create_tensor: loading tensor blk.34.ssm_a
create_tensor: loading tensor blk.34.ssm_ba.weight
create_tensor: loading tensor blk.34.ssm_norm.weight
create_tensor: loading tensor blk.34.ssm_out.weight
create_tensor: loading tensor blk.34.ffn_gate_inp.weight
tensor blk.34.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_down_exps.weight
tensor blk.34.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_gate_exps.weight
tensor blk.34.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.34.ffn_up_exps.weight
create_tensor: loading tensor blk.34.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.34.ffn_gate_shexp.weight
create_tensor: loading tensor blk.34.ffn_up_shexp.weight
create_tensor: loading tensor blk.34.ffn_down_shexp.weight
create_tensor: loading tensor blk.35.attn_norm.weight
create_tensor: loading tensor blk.35.post_attention_norm.weight
create_tensor: loading tensor blk.35.attn_q.weight
create_tensor: loading tensor blk.35.attn_k.weight
create_tensor: loading tensor blk.35.attn_v.weight
create_tensor: loading tensor blk.35.attn_output.weight
create_tensor: loading tensor blk.35.attn_q_norm.weight
create_tensor: loading tensor blk.35.attn_k_norm.weight
create_tensor: loading tensor blk.35.ffn_gate_inp.weight
tensor blk.35.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_down_exps.weight
tensor blk.35.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_gate_exps.weight
tensor blk.35.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.35.ffn_up_exps.weight
create_tensor: loading tensor blk.35.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.35.ffn_gate_shexp.weight
create_tensor: loading tensor blk.35.ffn_up_shexp.weight
create_tensor: loading tensor blk.35.ffn_down_shexp.weight
create_tensor: loading tensor blk.36.attn_norm.weight
create_tensor: loading tensor blk.36.post_attention_norm.weight
create_tensor: loading tensor blk.36.attn_qkv.weight
create_tensor: loading tensor blk.36.attn_gate.weight
create_tensor: loading tensor blk.36.ssm_conv1d.weight
create_tensor: loading tensor blk.36.ssm_dt.bias
create_tensor: loading tensor blk.36.ssm_a
create_tensor: loading tensor blk.36.ssm_ba.weight
create_tensor: loading tensor blk.36.ssm_norm.weight
create_tensor: loading tensor blk.36.ssm_out.weight
create_tensor: loading tensor blk.36.ffn_gate_inp.weight
tensor blk.36.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_down_exps.weight
tensor blk.36.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_gate_exps.weight
tensor blk.36.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.36.ffn_up_exps.weight
create_tensor: loading tensor blk.36.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.36.ffn_gate_shexp.weight
create_tensor: loading tensor blk.36.ffn_up_shexp.weight
create_tensor: loading tensor blk.36.ffn_down_shexp.weight
create_tensor: loading tensor blk.37.attn_norm.weight
create_tensor: loading tensor blk.37.post_attention_norm.weight
create_tensor: loading tensor blk.37.attn_qkv.weight
create_tensor: loading tensor blk.37.attn_gate.weight
create_tensor: loading tensor blk.37.ssm_conv1d.weight
create_tensor: loading tensor blk.37.ssm_dt.bias
create_tensor: loading tensor blk.37.ssm_a
create_tensor: loading tensor blk.37.ssm_ba.weight
create_tensor: loading tensor blk.37.ssm_norm.weight
create_tensor: loading tensor blk.37.ssm_out.weight
create_tensor: loading tensor blk.37.ffn_gate_inp.weight
tensor blk.37.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_down_exps.weight
tensor blk.37.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_gate_exps.weight
tensor blk.37.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.37.ffn_up_exps.weight
create_tensor: loading tensor blk.37.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.37.ffn_gate_shexp.weight
create_tensor: loading tensor blk.37.ffn_up_shexp.weight
create_tensor: loading tensor blk.37.ffn_down_shexp.weight
create_tensor: loading tensor blk.38.attn_norm.weight
create_tensor: loading tensor blk.38.post_attention_norm.weight
create_tensor: loading tensor blk.38.attn_qkv.weight
create_tensor: loading tensor blk.38.attn_gate.weight
create_tensor: loading tensor blk.38.ssm_conv1d.weight
create_tensor: loading tensor blk.38.ssm_dt.bias
create_tensor: loading tensor blk.38.ssm_a
create_tensor: loading tensor blk.38.ssm_ba.weight
create_tensor: loading tensor blk.38.ssm_norm.weight
create_tensor: loading tensor blk.38.ssm_out.weight
create_tensor: loading tensor blk.38.ffn_gate_inp.weight
tensor blk.38.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_down_exps.weight
tensor blk.38.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_gate_exps.weight
tensor blk.38.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.38.ffn_up_exps.weight
create_tensor: loading tensor blk.38.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.38.ffn_gate_shexp.weight
create_tensor: loading tensor blk.38.ffn_up_shexp.weight
create_tensor: loading tensor blk.38.ffn_down_shexp.weight
create_tensor: loading tensor blk.39.attn_norm.weight
create_tensor: loading tensor blk.39.post_attention_norm.weight
create_tensor: loading tensor blk.39.attn_q.weight
create_tensor: loading tensor blk.39.attn_k.weight
create_tensor: loading tensor blk.39.attn_v.weight
create_tensor: loading tensor blk.39.attn_output.weight
create_tensor: loading tensor blk.39.attn_q_norm.weight
create_tensor: loading tensor blk.39.attn_k_norm.weight
create_tensor: loading tensor blk.39.ffn_gate_inp.weight
tensor blk.39.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_down_exps.weight
tensor blk.39.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_gate_exps.weight
tensor blk.39.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.39.ffn_up_exps.weight
create_tensor: loading tensor blk.39.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.39.ffn_gate_shexp.weight
create_tensor: loading tensor blk.39.ffn_up_shexp.weight
create_tensor: loading tensor blk.39.ffn_down_shexp.weight
create_tensor: loading tensor blk.40.attn_norm.weight
create_tensor: loading tensor blk.40.post_attention_norm.weight
create_tensor: loading tensor blk.40.attn_qkv.weight
create_tensor: loading tensor blk.40.attn_gate.weight
create_tensor: loading tensor blk.40.ssm_conv1d.weight
create_tensor: loading tensor blk.40.ssm_dt.bias
create_tensor: loading tensor blk.40.ssm_a
create_tensor: loading tensor blk.40.ssm_ba.weight
create_tensor: loading tensor blk.40.ssm_norm.weight
create_tensor: loading tensor blk.40.ssm_out.weight
create_tensor: loading tensor blk.40.ffn_gate_inp.weight
tensor blk.40.ffn_down_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_down_exps.weight
tensor blk.40.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_gate_exps.weight
tensor blk.40.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.40.ffn_up_exps.weight
create_tensor: loading tensor blk.40.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.40.ffn_gate_shexp.weight
create_tensor: loading tensor blk.40.ffn_up_shexp.weight
create_tensor: loading tensor blk.40.ffn_down_shexp.weight
create_tensor: loading tensor blk.41.attn_norm.weight
create_tensor: loading tensor blk.41.post_attention_norm.weight
create_tensor: loading tensor blk.41.attn_qkv.weight
create_tensor: loading tensor blk.41.attn_gate.weight
create_tensor: loading tensor blk.41.ssm_conv1d.weight
create_tensor: loading tensor blk.41.ssm_dt.bias
create_tensor: loading tensor blk.41.ssm_a
create_tensor: loading tensor blk.41.ssm_ba.weight
create_tensor: loading tensor blk.41.ssm_norm.weight
create_tensor: loading tensor blk.41.ssm_out.weight
create_tensor: loading tensor blk.41.ffn_gate_inp.weight
tensor blk.41.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_down_exps.weight
tensor blk.41.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_gate_exps.weight
tensor blk.41.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.41.ffn_up_exps.weight
create_tensor: loading tensor blk.41.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.41.ffn_gate_shexp.weight
create_tensor: loading tensor blk.41.ffn_up_shexp.weight
create_tensor: loading tensor blk.41.ffn_down_shexp.weight
create_tensor: loading tensor blk.42.attn_norm.weight
create_tensor: loading tensor blk.42.post_attention_norm.weight
create_tensor: loading tensor blk.42.attn_qkv.weight
create_tensor: loading tensor blk.42.attn_gate.weight
create_tensor: loading tensor blk.42.ssm_conv1d.weight
create_tensor: loading tensor blk.42.ssm_dt.bias
create_tensor: loading tensor blk.42.ssm_a
create_tensor: loading tensor blk.42.ssm_ba.weight
create_tensor: loading tensor blk.42.ssm_norm.weight
create_tensor: loading tensor blk.42.ssm_out.weight
create_tensor: loading tensor blk.42.ffn_gate_inp.weight
tensor blk.42.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_down_exps.weight
tensor blk.42.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_gate_exps.weight
tensor blk.42.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.42.ffn_up_exps.weight
create_tensor: loading tensor blk.42.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.42.ffn_gate_shexp.weight
create_tensor: loading tensor blk.42.ffn_up_shexp.weight
create_tensor: loading tensor blk.42.ffn_down_shexp.weight
create_tensor: loading tensor blk.43.attn_norm.weight
create_tensor: loading tensor blk.43.post_attention_norm.weight
create_tensor: loading tensor blk.43.attn_q.weight
create_tensor: loading tensor blk.43.attn_k.weight
create_tensor: loading tensor blk.43.attn_v.weight
create_tensor: loading tensor blk.43.attn_output.weight
create_tensor: loading tensor blk.43.attn_q_norm.weight
create_tensor: loading tensor blk.43.attn_k_norm.weight
create_tensor: loading tensor blk.43.ffn_gate_inp.weight
tensor blk.43.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_down_exps.weight
tensor blk.43.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_gate_exps.weight
tensor blk.43.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.43.ffn_up_exps.weight
create_tensor: loading tensor blk.43.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.43.ffn_gate_shexp.weight
create_tensor: loading tensor blk.43.ffn_up_shexp.weight
create_tensor: loading tensor blk.43.ffn_down_shexp.weight
create_tensor: loading tensor blk.44.attn_norm.weight
create_tensor: loading tensor blk.44.post_attention_norm.weight
create_tensor: loading tensor blk.44.attn_qkv.weight
create_tensor: loading tensor blk.44.attn_gate.weight
create_tensor: loading tensor blk.44.ssm_conv1d.weight
create_tensor: loading tensor blk.44.ssm_dt.bias
create_tensor: loading tensor blk.44.ssm_a
create_tensor: loading tensor blk.44.ssm_ba.weight
create_tensor: loading tensor blk.44.ssm_norm.weight
create_tensor: loading tensor blk.44.ssm_out.weight
create_tensor: loading tensor blk.44.ffn_gate_inp.weight
tensor blk.44.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_down_exps.weight
tensor blk.44.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_gate_exps.weight
tensor blk.44.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.44.ffn_up_exps.weight
create_tensor: loading tensor blk.44.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.44.ffn_gate_shexp.weight
create_tensor: loading tensor blk.44.ffn_up_shexp.weight
create_tensor: loading tensor blk.44.ffn_down_shexp.weight
create_tensor: loading tensor blk.45.attn_norm.weight
create_tensor: loading tensor blk.45.post_attention_norm.weight
create_tensor: loading tensor blk.45.attn_qkv.weight
create_tensor: loading tensor blk.45.attn_gate.weight
create_tensor: loading tensor blk.45.ssm_conv1d.weight
create_tensor: loading tensor blk.45.ssm_dt.bias
create_tensor: loading tensor blk.45.ssm_a
create_tensor: loading tensor blk.45.ssm_ba.weight
create_tensor: loading tensor blk.45.ssm_norm.weight
create_tensor: loading tensor blk.45.ssm_out.weight
create_tensor: loading tensor blk.45.ffn_gate_inp.weight
tensor blk.45.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_down_exps.weight
tensor blk.45.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_gate_exps.weight
tensor blk.45.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.45.ffn_up_exps.weight
create_tensor: loading tensor blk.45.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.45.ffn_gate_shexp.weight
create_tensor: loading tensor blk.45.ffn_up_shexp.weight
create_tensor: loading tensor blk.45.ffn_down_shexp.weight
create_tensor: loading tensor blk.46.attn_norm.weight
create_tensor: loading tensor blk.46.post_attention_norm.weight
create_tensor: loading tensor blk.46.attn_qkv.weight
create_tensor: loading tensor blk.46.attn_gate.weight
create_tensor: loading tensor blk.46.ssm_conv1d.weight
create_tensor: loading tensor blk.46.ssm_dt.bias
create_tensor: loading tensor blk.46.ssm_a
create_tensor: loading tensor blk.46.ssm_ba.weight
create_tensor: loading tensor blk.46.ssm_norm.weight
create_tensor: loading tensor blk.46.ssm_out.weight
create_tensor: loading tensor blk.46.ffn_gate_inp.weight
tensor blk.46.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_down_exps.weight
tensor blk.46.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_gate_exps.weight
tensor blk.46.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.46.ffn_up_exps.weight
create_tensor: loading tensor blk.46.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.46.ffn_gate_shexp.weight
create_tensor: loading tensor blk.46.ffn_up_shexp.weight
create_tensor: loading tensor blk.46.ffn_down_shexp.weight
create_tensor: loading tensor blk.47.attn_norm.weight
create_tensor: loading tensor blk.47.post_attention_norm.weight
create_tensor: loading tensor blk.47.attn_q.weight
create_tensor: loading tensor blk.47.attn_k.weight
create_tensor: loading tensor blk.47.attn_v.weight
create_tensor: loading tensor blk.47.attn_output.weight
create_tensor: loading tensor blk.47.attn_q_norm.weight
create_tensor: loading tensor blk.47.attn_k_norm.weight
create_tensor: loading tensor blk.47.ffn_gate_inp.weight
tensor blk.47.ffn_down_exps.weight (420 MiB q6_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_down_exps.weight
tensor blk.47.ffn_gate_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_gate_exps.weight
tensor blk.47.ffn_up_exps.weight (352 MiB q5_K) buffer type overridden to ROCm_Host
create_tensor: loading tensor blk.47.ffn_up_exps.weight
create_tensor: loading tensor blk.47.ffn_gate_inp_shexp.weight
create_tensor: loading tensor blk.47.ffn_gate_shexp.weight
create_tensor: loading tensor blk.47.ffn_up_shexp.weight
create_tensor: loading tensor blk.47.ffn_down_shexp.weight
done_getting_tensors: tensor 'token_embd.weight' (q5_K) (and 74 others) cannot be used with preferred buffer type ROCm_Host, using CPU instead
load_tensors: offloading output layer to GPU
load_tensors: offloading 47 repeating layers to GPU
load_tensors: offloaded 49/49 layers to GPU
load_tensors: CPU model buffer size = 204.02 MiB
load_tensors: ROCm0 model buffer size = 27423.82 MiB
load_tensors: ROCm_Host model buffer size = 26580.82 MiB
load_all_data: no device found for buffer type CPU for async uploads
load_all_data: using async uploads for device ROCm0, buffer type ROCm0, backend ROCm0
..................................................load_all_data: buffer type ROCm_Host is not the default buffer type for device ROCm0 for async uploads
..................................................
common_init_result: added </s> logit bias = -inf
common_init_result: added <|endoftext|> logit bias = -inf
common_init_result: added <|im_end|> logit bias = -inf
common_init_result: added <|fim_pad|> logit bias = -inf
common_init_result: added <|repo_name|> logit bias = -inf
common_init_result: added <|file_sep|> logit bias = -inf
llama_context: constructing llama_context
llama_context: n_seq_max = 1
llama_context: n_ctx = 131072
llama_context: n_ctx_seq = 131072
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = enabled
llama_context: kv_unified = false
llama_context: freq_base = 5000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_seq (131072) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: ROCm_Host output buffer size = 0.58 MiB
llama_kv_cache: layer 0: filtered
llama_kv_cache: layer 1: filtered
llama_kv_cache: layer 2: filtered
llama_kv_cache: layer 3: dev = ROCm0
llama_kv_cache: layer 4: filtered
llama_kv_cache: layer 5: filtered
llama_kv_cache: layer 6: filtered
llama_kv_cache: layer 7: dev = ROCm0
llama_kv_cache: layer 8: filtered
llama_kv_cache: layer 9: filtered
llama_kv_cache: layer 10: filtered
llama_kv_cache: layer 11: dev = ROCm0
llama_kv_cache: layer 12: filtered
llama_kv_cache: layer 13: filtered
llama_kv_cache: layer 14: filtered
llama_kv_cache: layer 15: dev = ROCm0
llama_kv_cache: layer 16: filtered
llama_kv_cache: layer 17: filtered
llama_kv_cache: layer 18: filtered
llama_kv_cache: layer 19: dev = ROCm0
llama_kv_cache: layer 20: filtered
llama_kv_cache: layer 21: filtered
llama_kv_cache: layer 22: filtered
llama_kv_cache: layer 23: dev = ROCm0
llama_kv_cache: layer 24: filtered
llama_kv_cache: layer 25: filtered
llama_kv_cache: layer 26: filtered
llama_kv_cache: layer 27: dev = ROCm0
llama_kv_cache: layer 28: filtered
llama_kv_cache: layer 29: filtered
llama_kv_cache: layer 30: filtered
llama_kv_cache: layer 31: dev = ROCm0
llama_kv_cache: layer 32: filtered
llama_kv_cache: layer 33: filtered
llama_kv_cache: layer 34: filtered
llama_kv_cache: layer 35: dev = ROCm0
llama_kv_cache: layer 36: filtered
llama_kv_cache: layer 37: filtered
llama_kv_cache: layer 38: filtered
llama_kv_cache: layer 39: dev = ROCm0
llama_kv_cache: layer 40: filtered
llama_kv_cache: layer 41: filtered
llama_kv_cache: layer 42: filtered
llama_kv_cache: layer 43: dev = ROCm0
llama_kv_cache: layer 44: filtered
llama_kv_cache: layer 45: filtered
llama_kv_cache: layer 46: filtered
llama_kv_cache: layer 47: dev = ROCm0
llama_kv_cache: ROCm0 KV buffer size = 3072.00 MiB
llama_kv_cache: size = 3072.00 MiB (131072 cells, 12 layers, 1/1 seqs), K (f16): 1536.00 MiB, V (f16): 1536.00 MiB
llama_memory_recurrent, layer 0: dev = ROCm0
llama_memory_recurrent, layer 1: dev = ROCm0
llama_memory_recurrent, layer 2: dev = ROCm0
llama_memory_recurrent: layer 3: skipped
llama_memory_recurrent, layer 4: dev = ROCm0
llama_memory_recurrent, layer 5: dev = ROCm0
llama_memory_recurrent, layer 6: dev = ROCm0
llama_memory_recurrent: layer 7: skipped
llama_memory_recurrent, layer 8: dev = ROCm0
llama_memory_recurrent, layer 9: dev = ROCm0
llama_memory_recurrent, layer 10: dev = ROCm0
llama_memory_recurrent: layer 11: skipped
llama_memory_recurrent, layer 12: dev = ROCm0
llama_memory_recurrent, layer 13: dev = ROCm0
llama_memory_recurrent, layer 14: dev = ROCm0
llama_memory_recurrent: layer 15: skipped
llama_memory_recurrent, layer 16: dev = ROCm0
llama_memory_recurrent, layer 17: dev = ROCm0
llama_memory_recurrent, layer 18: dev = ROCm0
llama_memory_recurrent: layer 19: skipped
llama_memory_recurrent, layer 20: dev = ROCm0
llama_memory_recurrent, layer 21: dev = ROCm0
llama_memory_recurrent, layer 22: dev = ROCm0
llama_memory_recurrent: layer 23: skipped
llama_memory_recurrent, layer 24: dev = ROCm0
llama_memory_recurrent, layer 25: dev = ROCm0
llama_memory_recurrent, layer 26: dev = ROCm0
llama_memory_recurrent: layer 27: skipped
llama_memory_recurrent, layer 28: dev = ROCm0
llama_memory_recurrent, layer 29: dev = ROCm0
llama_memory_recurrent, layer 30: dev = ROCm0
llama_memory_recurrent: layer 31: skipped
llama_memory_recurrent, layer 32: dev = ROCm0
llama_memory_recurrent, layer 33: dev = ROCm0
llama_memory_recurrent, layer 34: dev = ROCm0
llama_memory_recurrent: layer 35: skipped
llama_memory_recurrent, layer 36: dev = ROCm0
llama_memory_recurrent, layer 37: dev = ROCm0
llama_memory_recurrent, layer 38: dev = ROCm0
llama_memory_recurrent: layer 39: skipped
llama_memory_recurrent, layer 40: dev = ROCm0
llama_memory_recurrent, layer 41: dev = ROCm0
llama_memory_recurrent, layer 42: dev = ROCm0
llama_memory_recurrent: layer 43: skipped
llama_memory_recurrent, layer 44: dev = ROCm0
llama_memory_recurrent, layer 45: dev = ROCm0
llama_memory_recurrent, layer 46: dev = ROCm0
llama_memory_recurrent: layer 47: skipped
llama_memory_recurrent: ROCm0 RS buffer size = 75.38 MiB
llama_memory_recurrent: size = 75.38 MiB ( 1 cells, 48 layers, 1 seqs), R (f32): 3.38 MiB, S (f32): 72.00 MiB
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
sched_reserve: reserving ...
sched_reserve: max_nodes = 26976
sched_reserve: reserving full memory module
sched_reserve: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 1
sched_reserve: resolving fused Gated Delta Net support:
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
sched_reserve: fused Gated Delta Net (autoregressive) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 16, n_seqs = 1, n_outputs = 16
sched_reserve: fused Gated Delta Net (chunked) enabled
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
sched_reserve: ROCm0 compute buffer size = 840.01 MiB
sched_reserve: ROCm_Host compute buffer size = 264.01 MiB
sched_reserve: graph nodes = 5013
sched_reserve: graph splits = 76 (with bs=512), 54 (with bs=1)
sched_reserve: reserve took 274.49 ms, sched copies = 1
set_adapters_lora: adapters = (nil)
adapters_lora_are_same: adapters = (nil)
common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
set_warmup: value = 1
set_warmup: value = 0
srv load_model: initializing slots, n_slots = 1
common_speculative_is_compat: the target context does not support partial sequence removal
srv load_model: speculative decoding not supported by this context
slot load_model: id 0 | task -1 | new slot, n_ctx = 131072
slot reset: id 0 | task -1 |
srv load_model: prompt cache is enabled, size limit: 8192 MiB
srv load_model: use `--cache-ram 0` to disable the prompt cache
srv load_model: for more info see https://github.com/ggml-org/llama.cpp/pull/16391
Using differential autoparser
=== Starting differential analysis ===
Phase 1: Reasoning analysis
Phase 2: Content analysis
Phase 3: Tool call analysis
Phase 4: Argument analysis

--- Reasoning & Content Structure ---
reasoning_mode: NONE
reasoning_start: ''
reasoning_end: ''
content_mode: PLAIN
content_start: ''
content_end: ''
--- Tool Call Structure ---
tool_mode: TAG_WITH_TAGGED
supports_tools: true
supports_parallel_calls: true
tool_section_start: ''
tool_section_end: ''
per_call_start: '<tool_call>
'
per_call_end: '</tool_call>'
func_name_prefix: '<function='
func_name_suffix: '>
'
func_close: '</function>
'
python_dict_format: false
arg_name_prefix: '<parameter='
arg_name_suffix: '>
'
arg_value_prefix: ''
arg_value_suffix: '</parameter>
'
name_field: 'name'
args_field: 'arguments'
id_field: ''
gen_id_field: ''
parameter_order: ''
=== Differential analysis complete ===
init: chat template, example_format: '<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
'
Using differential autoparser
=== Starting differential analysis ===
Phase 1: Reasoning analysis
Phase 2: Content analysis
Phase 3: Tool call analysis
Phase 4: Argument analysis

--- Reasoning & Content Structure ---
reasoning_mode: NONE
reasoning_start: ''
reasoning_end: ''
content_mode: PLAIN
content_start: ''
content_end: ''
--- Tool Call Structure ---
tool_mode: TAG_WITH_TAGGED
supports_tools: true
supports_parallel_calls: true
tool_section_start: ''
tool_section_end: ''
per_call_start: '<tool_call>
'
per_call_end: '</tool_call>'
func_name_prefix: '<function='
func_name_suffix: '>
'
func_close: '</function>
'
python_dict_format: false
arg_name_prefix: '<parameter='
arg_name_suffix: '>
'
arg_value_prefix: ''
arg_value_suffix: '</parameter>
'
name_field: 'name'
args_field: 'arguments'
id_field: ''
gen_id_field: ''
parameter_order: ''
=== Differential analysis complete ===
srv init: init: chat template, thinking = 0
main: model loaded
main: server is listening on http://0.0.0.0:8001
main: starting the main loop...
que start_loop: processing new tasks
que start_loop: update slots
srv update_slots: all slots are idle
que start_loop: waiting for new tasks