Octocontrabass wrote:
Ethin wrote:
MDTS is 7, or 128 bytes, which is ridiculously tiny for data transfers -- that's not even a full sector.
MDTS is specified as a power of two (2^MDTS) of the minimum memory page size (CAP.MPSMIN). QEMU sets MPSMIN to indicate 4kiB pages, so an MDTS of 7 means 2^7 = 128 pages, or 512kiB.
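To make the arithmetic concrete, here's a sketch of deriving the limit in bytes from the raw CAP register and the MDTS field; the function name is made up, but the field positions follow the spec:

```rust
/// Sketch: derive the maximum data transfer size in bytes.
/// `cap` is the raw 64-bit CAP register value; `mdts` is the MDTS
/// field from the Identify Controller data. Helper name is hypothetical.
fn max_transfer_bytes(cap: u64, mdts: u8) -> Option<u64> {
    if mdts == 0 {
        return None; // MDTS of 0 means "no limit reported"
    }
    let mpsmin = (cap >> 48) & 0xF;        // CAP.MPSMIN, bits 51:48
    let page_size = 1u64 << (12 + mpsmin); // minimum page size = 2^(12 + MPSMIN)
    Some(page_size << mdts)                // 2^MDTS minimum-sized pages
}
```

With QEMU's CAP.MPSMIN of 0 (4kiB pages) and MDTS of 7, this gives 4096 << 7 = 524288 bytes, i.e. 512kiB.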
Ethin wrote:
ESDTT is 0, suggesting that self-tests aren't supported
Device self-test support is indicated by OACS bit 4. QEMU doesn't set that bit.
Ethin wrote:
Max NVM set ID and endurance group IDs are 0
Support for NVM sets is indicated by CTRATT bit 2. Support for endurance groups is indicated by CTRATT bit 4. QEMU doesn't set either bit.
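These optional-feature checks (including the OACS self-test bit above) are plain bit tests on Identify Controller fields; a sketch, with `oacs` and `ctratt` taken from the identify data (function names are made up):

```rust
/// Sketch: optional-feature checks from Identify Controller fields.
/// `oacs` is bytes 257:256 and `ctratt` is bytes 99:96 of the
/// Identify Controller data structure.
fn supports_self_test(oacs: u16) -> bool {
    oacs & (1 << 4) != 0 // OACS bit 4: Device Self-test command
}

fn supports_nvm_sets(ctratt: u32) -> bool {
    ctratt & (1 << 2) != 0 // CTRATT bit 2: NVM Sets
}

fn supports_endurance_groups(ctratt: u32) -> bool {
    ctratt & (1 << 4) != 0 // CTRATT bit 4: Endurance Groups
}
```

Since QEMU reports all of these bits as zero, each check returns false there.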
Ethin wrote:
MAXCMD is 0
MAXCMD is optional for NVMe attached to PCIe.
Ethin wrote:
NN is only 256 and MNAN is 0
That sounds reasonable to me. MNAN is optional. Do you really need more than 256 namespaces?
Ethin wrote:
The problem is that, though I can get a real NVMe disk, I don't know how to use USB passthrough but make it look like a PCIe device so I can test it, and I'd rather not build an entire USB stack just to test an NVMe driver. Talk about overkill...
You can do PCIe passthrough with an external Thunderbolt enclosure. I can't imagine it's a very cost-effective way to test your driver, but it's an option!
Aha, thanks. Yeah, my NVMe controller is getting hung somewhere -- or my code is messing up. I can send the initial identify command, but it's getting hung when I ask it for active NSIDs. The code looks like:
Code:
fn process_command(
    &mut self,
    req: Self::CommandRequest,
) -> Result<Self::Response, Self::Error> {
    debug!("Processing command {:?}", req);
    debug!("Waiting for controller to be ready...");
    loop {
        if self.read_csts().get_bit(0) {
            break;
        }
    }
    debug!("Writing request");
    self.sqs[req.qid].queue_command(req.entry);
    if req.qid == 0 {
        debug!("Queue is admin queue, writing admin queue doorbell");
        self.write_adm_sub_tail_queue_doorbell(self.sqs[req.qid].get_queue_tail());
    } else {
        debug!("Writing to queue {} doorbell", req.qid);
        self.write_sub_tail_doorbell(req.qid, self.sqs[req.qid].get_queue_tail());
    }
    debug!("Waiting for response");
    loop {
        debug!("Attempting to acquire interrupt queue read lock");
        if let Some(i) = INTRS.try_read() {
            debug!("Checking to see if this interrupt was ours");
            if i[self.intrspos].1 > 0 {
                break;
            }
        }
        debug!("No interrupts received, waiting for more");
        hlt();
    }
    debug!("Disabling interrupts");
    disable_interrupts();
    debug!("Acquiring interrupt queue write lock");
    let mut i = INTRS.write();
    debug!("Reading new queue entries");
    let mut entries: MiniVec<queues::CompletionQueueEntry> = MiniVec::new();
    self.cqs[req.qid].read_new_entries(&mut entries);
    debug!(
        "Decreasing interrupt count from {} to {}",
        i[self.intrspos].1,
        i[self.intrspos].1 - 1
    );
    i[self.intrspos].1 -= 1;
    if req.qid == 0 {
        debug!("Writing to admin completion queue doorbell");
        self.write_adm_comp_head_queue_doorbell(self.cqs[req.qid].get_queue_head());
    } else {
        debug!("Writing completion queue doorbell for queue {}", req.qid);
        self.write_comp_head_doorbell(req.qid, self.cqs[req.qid].get_queue_head());
    }
    if entries.len() > 1 {
        warn!(
            "Retrieved {} responses; returning only first",
            entries.len()
        );
        entries.truncate(1);
    }
    let entry = entries[0];
    enable_interrupts();
    if entry.status.sc != 0x00 {
        Err(entry.status)
    } else {
        Ok(Response {
            qid: req.qid,
            entry,
        })
    }
}
The output looks like:
Code:
[DEBUG] [libk::nvme] Processing command Request { qid: 0, entry: SubmissionQueueEntry { cdw0: 6, nsid: 0, _rsvd: 0, mptr: 0, prps: [2077036544, 0], operands: [1, 0, 0, 0, 0, 0] } }
[DEBUG] [libk::nvme] Waiting for controller to be ready...
[DEBUG] [libk::nvme] Writing request
[DEBUG] [libk::nvme] Queue is admin queue, writing admin queue doorbell
[DEBUG] [libk::nvme] Writing to adm submission tail doorbel at memaddr FEBD5000h: 1h
[DEBUG] [libk::nvme] Waiting for response
[DEBUG] [libk::interrupts] Interrupt received for int 229
[DEBUG] [libk::interrupts] Acquired lock
[DEBUG] [libk::interrupts] Calling func 0
[DEBUG] [libk::nvme] Attempting to acquire interrupt queue read lock
[DEBUG] [libk::nvme] Checking to see if this interrupt was ours
[DEBUG] [libk::nvme] Disabling interrupts
[DEBUG] [libk::nvme] Acquiring interrupt queue write lock
[DEBUG] [libk::nvme] Reading new queue entries
[DEBUG] [libk::nvme] Decreasing interrupt count from 1 to 0
[DEBUG] [libk::nvme] Writing to admin completion queue doorbell
[DEBUG] [libk::nvme] Writing to adm completion head doorbel at memaddr FEBD5004h: 1h
[INFO] [libk::nvme] Vendor ID: 1B36, subsystem vendor ID: 1AF4
[INFO] [libk::nvme] Serial number: 0001
[INFO] [libk::nvme] Model number: QEMU NVMe Ctrl
[INFO] [libk::nvme] Firmware revision: 1.0
[INFO] [libk::nvme] RAB: 40h
[INFO] [libk::nvme] FRU GUID: 0
[INFO] [libk::nvme] NVM capacity total: 0; unallocated NVM capacity: 0
[INFO] [libk::nvme] NVMe qualified name: nqn.2019-08.org.qemu:0001
[INFO] [libk::nvme] MDTS is 7 pages (128 bytes)
[INFO] [libk::nvme] Extended device self-test time is 0 minutes
[INFO] [libk::nvme] Max NVM set identifier is 0
[INFO] [libk::nvme] Max endurance group ID is 0
[INFO] [libk::nvme] SQES is 6
[INFO] [libk::nvme] CQES is 4
[INFO] [libk::nvme] maxcmd is 0
[INFO] [libk::nvme] nn is 256, mnan is 0
[INFO] [libk::nvme] Checking for active namespaces
[DEBUG] [libk::nvme] Processing command Request { qid: 0, entry: SubmissionQueueEntry { cdw0: 6, nsid: 0, _rsvd: 0, mptr: 0, prps: [2046713856, 0], operands: [2, 0, 0, 0, 0, 0] } }
[DEBUG] [libk::nvme] Waiting for controller to be ready...
[DEBUG] [libk::nvme] Writing request
[DEBUG] [libk::nvme] Queue is admin queue, writing admin queue doorbell
[DEBUG] [libk::nvme] Writing to adm submission tail doorbel at memaddr FEBD5000h: 2h
[DEBUG] [libk::nvme] Waiting for response
[DEBUG] [libk::nvme] Attempting to acquire interrupt queue read lock
[DEBUG] [libk::nvme] Checking to see if this interrupt was ours
[DEBUG] [libk::nvme] No interrupts received, waiting for more
It's weird, and I'm not really sure what's wrong. Even scoping the locks to force them to auto-release doesn't solve the problem.
Update: I completely forgot about the QEMU logs I'd generated. But the logs raise more questions than answers:
Code:
pci_nvme_admin_cmd cid 0 sqid 0 opc 0x0 opname 'NVME_ADM_CMD_DELETE_SQ'
pci_nvme_err_invalid_del_sq invalid submission queue deletion, sid=0
pci_nvme_enqueue_req_completion cid 0 cqid 0 status 0x4101
pci_nvme_err_req_status cid 0 nsid 4294967295 status 0x4101 opc 0x0
pci_nvme_irq_msix raising MSI-X IRQ vector 0
apic_deliver_irq dest 255 dest_mode 0 delivery_mode 0 vector 158 trigger_mode 0
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
apic_report_irq_delivered coalescing 9
The confusing thing is that I'm not sending that opcode at all. My identify command sends the Identify opcode, 0x06, passing 0x02 as the CNS value and 0 as the NSID, controller ID, NVM set identifier, and UUID index. (Note: the interrupt vector changed because this was a different run, and my kernel assigns random IRQ numbers to devices to reduce interrupt collisions.)
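For reference, the entry I'm trying to send looks roughly like this (a simplified sketch mirroring the SubmissionQueueEntry layout in the debug output above; the builder function and CID handling are omitted/hypothetical):

```rust
/// Sketch: building an Identify (opcode 06h) command requesting the
/// active namespace ID list (CNS 02h). Field layout mirrors the
/// debug output above; queue/doorbell plumbing is omitted.
#[derive(Debug, Clone, Copy, Default)]
struct SubmissionQueueEntry {
    cdw0: u32,          // opcode in bits 7:0; CID would go in bits 31:16
    nsid: u32,          // NSIDs greater than this value are returned
    _rsvd: u64,
    mptr: u64,          // metadata pointer, unused here
    prps: [u64; 2],     // PRP entries for the data buffer
    operands: [u32; 6], // CDW10..CDW15
}

fn identify_active_nsids(buffer_phys: u64, starting_nsid: u32) -> SubmissionQueueEntry {
    SubmissionQueueEntry {
        cdw0: 0x06,                      // Identify opcode; CID left zero here
        nsid: starting_nsid,
        prps: [buffer_phys, 0],          // the 4KiB NSID list fits in one page
        operands: [0x02, 0, 0, 0, 0, 0], // CDW10: CNS = 02h, CNTID = 0
        ..Default::default()
    }
}
```

This matches the second logged request (cdw0: 6, nsid: 0, operands: [2, 0, ...]), which is why the DELETE_SQ opcode in the QEMU trace is so baffling.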
On a different note, I managed to get my kernel booting in VirtualBox, but it doesn't seem to support MSI-X, and I haven't implemented plain MSI yet.