OSDev.org

The Place to Start for Operating System Developers

All times are UTC - 6 hours




 Post subject: connecting multiple ethernet 10g link to one processor
Posted: Tue Jan 03, 2017 7:45 pm

We are working on a design project with a multi-socket system in which several 10G Ethernet links are connected to one CPU, while the 2nd CPU has basically no NIC connected to it, by hardware design.
Some folks are concerned that this will cause congestion, and the debate has heated up. IMO, this generally does not matter, because the APIC distributes whatever interrupts come from the devices. I wondered whether there could be a performance hit on NUMA-enabled systems, but NUMA is largely concerned with memory locality and has nothing to do with network connectivity. The design being compared against is a system with one 10G link connected to each CPU. Any thoughts on this? Thanks.

_________________
key takeaway after spending yrs on sw industry: big issue small because everyone jumps on it and fixes it. small issue is big since everyone ignores and it causes catastrophy later. #devilisinthedetails


 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Tue Jan 03, 2017 10:24 pm

You will want to examine everything associated with DPDK.... the Intel Data Plane Development Kit

They are working with not just 10gig cards but also 40gig cards

http://www.dpdk.org/

here is a paper on 40gig performance on off the shelf hardware...

http://perso.telecom-paristech.fr/~dros ... echrep.pdf

Now to the specifics....

DPDK's threading model is one thread per core where the core is exempted from the OS's scheduler.

Threads are run-to-completion and avoid OS syscalls...

In general practice, a whole core - A WHOLE CORE - is dedicated to *** just reading*** the hardware rings on the NICs connected to it...

Another whole core is dedicated to *** just writing *** to the hardware rings of NICs connected to it...

The DPDK software is very cache aware and the data structures in it are tuned to be cache line aligned.

Now towards your NUMA issues....

DPDK takes NUMA into account: it maps Huge Page memory (also not under OS memory manager control) to physical addresses, finds contiguous pages, and ***notes which physical memory is connected to which physical SOCKET***

This allows DPDK to allocate memory to a ring buffer and packet buffer which will be accessed by core 3 on socket 0 and be assured that this memory is physically connected to socket 0.

This substantially reduces the NUMA hit your team members are concerned about.... but it is a software problem... not a hardware one.
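
To make the "which memory and which cores sit next to which NIC" bookkeeping a bit more concrete, here is a minimal sketch. It is not DPDK code; it only asks Linux for the same locality information through sysfs. It assumes the usual files `/sys/class/net/<iface>/device/numa_node` and `/sys/devices/system/node/node<N>/cpulist` are present; the function names and the `sysfs_root` parameter are made up for this example.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-3,8,10-11' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def nic_numa_node(iface, sysfs_root="/sys"):
    """Return the NUMA node the NIC is attached to, or None if unknown.

    The kernel reports -1 when the platform gives no locality information.
    """
    path = Path(sysfs_root) / "class" / "net" / iface / "device" / "numa_node"
    try:
        node = int(path.read_text().strip())
    except (OSError, ValueError):
        return None
    return node if node >= 0 else None


def local_cpus(node, sysfs_root="/sys"):
    """Return the CPU numbers that belong to the given NUMA node."""
    path = (Path(sysfs_root) / "devices" / "system" / "node"
            / f"node{node}" / "cpulist")
    return parse_cpulist(path.read_text())


def cpus_near_nic(iface, sysfs_root="/sys"):
    """CPUs on the same node as the NIC; empty list if locality is unknown."""
    node = nic_numa_node(iface, sysfs_root)
    return [] if node is None else local_cpus(node, sysfs_root)
```

On a real box, something like `cpus_near_nic("eth0")` (interface name is a placeholder) would give the candidate cores for the threads that poll that NIC, so the rings and packet buffers they touch can stay on the same socket.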

Ultimately, the argument about why your second, unconnected processor may lag in performance comes down to just what software you run on it....

For example...

socket 0 cores 0 - 8 -- run networking code and filtering algorithms

socket 1 cores 9 - 15 -- run the OS and a database application which consumes filtered events

the lag is one way from socket 0 (the network) to socket 1 (the database)

If the database has to communicate at wire speed - it would be better to have some of it running on socket 0.

So... please check out DPDK, and go ask your team just what applications this hardware is supposedly designed for..... Although I find it hard to believe the second socket doesn't have its own PCIe connections....

Good luck and cheers!

_________________
Plagiarize. Plagiarize. Let not one line escape thine eyes...


Last edited by dchapiesky on Tue Jan 03, 2017 10:31 pm, edited 1 time in total.

 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Tue Jan 03, 2017 10:28 pm

ggodw000 wrote:
..... but then NUMA is largely concerned with memory locality and has nothing to do with network connectivity


Please note that most 10gig / 40gig cards connect directly to a CPU socket's PCIe lanes and, on platforms that support it (e.g. Intel DDIO), write directly into that socket's cache

For a packet to get from socket 0 to socket 1 it must travel from socket 0's cache to socket 1's cache over the inter-socket link... While this probably doesn't invalidate memory and force a cache flush, it will saturate the inter-socket communications if core 9 on socket 1 is reading the h/w rings of a NIC on socket 0.....

 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Tue Jan 03, 2017 10:53 pm

+1 to what dchapiesky said. I worked for a while on a system like that (many instances of one 10G NIC paired with one CPU core; I poked around the NIC drivers and TCP/IP). If I'm not mistaken, we disabled hyperthreading. Some of our old boxes also had issues with PCIe being shared between NICs and/or some other devices, which made it hard to reach and sustain full speed (cheap(er) off-the-shelf hardware may have nasty surprises). DPDK is a good place to start.


 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Tue Jan 03, 2017 11:56 pm

Hi,

ggodw000 wrote:
We are working on a design project with a multi-socket system in which several 10G Ethernet links are connected to one CPU, while the 2nd CPU has basically no NIC connected to it, by hardware design.
Some folks are concerned that this will cause congestion, and the debate has heated up. IMO, this generally does not matter, because the APIC distributes whatever interrupts come from the devices. I wondered whether there could be a performance hit on NUMA-enabled systems, but NUMA is largely concerned with memory locality and has nothing to do with network connectivity. The design being compared against is a system with one 10G link connected to each CPU. Any thoughts on this? Thanks.


If you (e.g.) ask the NIC's "TCP offload engine" to send 64 KiB of data; how much communication does the NIC need to do to fetch all the cache lines and how much communication does the NIC need to do to issue a single "MSI write" to send an IRQ at the end (if that IRQ isn't skipped due to IRQ rate limiting or something)? I'm fairly sure you'll find that (for "number of transactions across buses/links") reads/writes to memory are more significant than IRQs by multiple orders of magnitude.
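
As a back-of-the-envelope check of that ratio, here is a small sketch. It assumes 64-byte cache lines (typical for x86, but an assumption here) and counts one read per cache line against a single MSI write; real PCIe transfers may batch several lines into one request, so treat it as an upper-bound style estimate. The function name is made up.

```python
CACHE_LINE_BYTES = 64  # assumption: typical x86 cache line size


def memory_reads_per_irq(payload_bytes, cache_line_bytes=CACHE_LINE_BYTES, irqs=1):
    """Cache-line-sized reads needed for one send request, per MSI write."""
    lines = -(-payload_bytes // cache_line_bytes)  # ceiling division
    return lines / irqs


# 64 KiB handed to the offload engine, one IRQ at the end
print(memory_reads_per_irq(64 * 1024))  # 1024.0
```

So even for this single request, memory traffic outnumbers interrupt traffic by about three orders of magnitude, and IRQ rate limiting only widens the gap.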

The only case where IRQs might actually matter is the "extremely large number of extremely tiny packets" case, but for this case the application developer has lost the right to expect acceptable performance (for failing to do anything to combine multiple tiny packets into fewer larger packets); not least of all because the majority of the "10 gigabits of available bandwidth on the wire" will be wasted on inter-packet gaps and packet headers (regardless of what OS or NIC does).
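
To put rough numbers on the tiny-packet case, here is a sketch using the standard Ethernet framing overheads (8 bytes of preamble and start-of-frame delimiter, a 12-byte inter-packet gap, 14-byte header, 4-byte FCS, 64-byte minimum frame) and assuming plain IPv4 + TCP headers without options. The function name and the choice of payload sizes are illustrative only.

```python
LINK_BPS = 10e9            # 10 Gbit/s line rate
PREAMBLE_SFD = 8           # 7-byte preamble + 1-byte start-of-frame delimiter
INTERFRAME_GAP = 12        # minimum gap between frames, in byte times
ETH_HEADER_FCS = 14 + 4    # MAC header + frame check sequence
IP_TCP_HEADERS = 20 + 20   # assumption: IPv4 + TCP, no options
MIN_FRAME = 64             # minimum Ethernet frame, header through FCS


def wire_stats(tcp_payload_bytes):
    """Return (frames per second, fraction of wire time carrying payload)."""
    frame = max(MIN_FRAME, ETH_HEADER_FCS + IP_TCP_HEADERS + tcp_payload_bytes)
    on_wire = frame + PREAMBLE_SFD + INTERFRAME_GAP
    frames_per_sec = LINK_BPS / (on_wire * 8)
    efficiency = tcp_payload_bytes / on_wire
    return frames_per_sec, efficiency


for payload in (1, 1460):
    fps, eff = wire_stats(payload)
    print(f"{payload:>5} B payload: {fps / 1e6:6.2f} Mpps, {eff:6.1%} payload share")
```

With minimum-size frames the link carries roughly 14.88 million frames per second and almost none of the bandwidth is application data; with full 1460-byte segments the payload share is close to 95%.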

For congestion on the link between CPUs; don't forget that even the oldest and slowest version of QuickPath is (according to Wikipedia) running at 153.6 Gbit/s and the NIC's worst case maximum of 10 Gbit/s is a relatively slow dribble. In practice, what is likely to be far more important is the software overhead of the OS and libraries (and not hardware) - things like routing and firewalls (and anti-virus), and crusty old socket APIs (based on "synchronous read/write").
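
For a feel of the proportions, here is a small sketch of the worst case where every bit from every NIC has to cross the inter-socket link. The 153.6 Gbit/s default is the figure quoted above; as far as I know that number adds up both directions, so halving it gives a more pessimistic one-way estimate. That halving, and the function name, are assumptions of this example.

```python
def link_share(num_nics, nic_gbits=10.0, link_gbits=153.6):
    """Fraction of the inter-socket link used if all NIC traffic crosses it."""
    return num_nics * nic_gbits / link_gbits


for nics in (1, 4):
    both = link_share(nics)
    one_way = link_share(nics, link_gbits=153.6 / 2)  # assumed per-direction rate
    print(f"{nics} x 10G: {both:.1%} of the quoted rate, {one_way:.1%} one-way")
```

One 10G link comes to about 6.5% of the quoted rate; four links come to about 26%, or about 52% under the one-way assumption. That still fits, but it is the total across "several" links, not a single 10G link, that decides how much headroom is left.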


Cheers,

Brendan

_________________
For all things; perfection is, and will always remain, impossible to achieve in practice. However; by striving for perfection we create things that are as perfect as practically possible. Let the pursuit of perfection be our guide.


 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Wed Jan 04, 2017 1:02 am

Brendan wrote:
In practice, what is likely to be far more important is the software overhead of the OS and libraries (and not hardware) - things like routing and firewalls (and anti-virus), and crusty old socket APIs (based on "synchronous read/write").


Very true. DPDK doesn't even use interrupts - it polls the NIC's h/w rings continuously (hence the 1 core for incoming packets and 1 core for outgoing) to achieve 40gig wire speed.
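
The shape of such a poll loop, as a toy sketch only: the ring here is a plain Python deque, and `rx_burst` is a stand-in with a made-up signature (DPDK's real receive call is `rte_eth_rx_burst()` in C). The burst size of 32 is an arbitrary choice for the example.

```python
from collections import deque

BURST = 32  # assumption: packets pulled per poll, purely illustrative


def rx_burst(ring, max_pkts=BURST):
    """Take up to max_pkts packets off the ring without blocking; may return []."""
    pkts = []
    while ring and len(pkts) < max_pkts:
        pkts.append(ring.popleft())
    return pkts


def poll_loop(ring, handle, max_polls):
    """Busy-poll the ring: no interrupts, no sleeping, empty polls just spin."""
    empty_polls = 0
    for _ in range(max_polls):
        pkts = rx_burst(ring)
        if not pkts:
            empty_polls += 1  # wasted cycles: the price of a dedicated core
            continue
        for pkt in pkts:
            handle(pkt)
    return empty_polls
```

A real loop never exits (`max_polls` exists only so the sketch terminates), and the count of empty polls is exactly the idle burn that makes a dedicated polling core expensive on a shared box.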

They are currently attempting to address the issue of power cost & idle in virtual machines.. where these cores are eating up empty cycles on a shared box. That is another thread of discussion altogether.

In any case I would love to hear more about the hardware ggodw000 opened with... can we know more details?

 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Wed Jan 04, 2017 4:03 pm

Thanks all for the input. I know it is not a simple problem, and it can be affected by a conglomeration of many pieces: hardware, kernel, OS, application and all related features. It will probably take some time to digest all the input.
dchapiesky, the system info is really confidential. I wish I could tell more about it, but it is a soon-to-be-released product, not some school project.

 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Wed Jan 04, 2017 8:31 pm

ggodw000 wrote:
the system info is really confidential


Glad you are getting paid brother 8) (or sister) (or...)

I love long term projects that you can't tell anyone about...

"How's your project going?" .... "Fine..."

9 months later

"How's your project going?" .... "Fine..."

Cheers and good luck on it.

Seriously.. look at *everything* DPDK - particularly DPDK Pktgen (the packet generator) on their GitHub - it produces a WIRESPEED packet storm to test your hardware/software combo

 
 Post subject: Re: connecting multiple ethernet 10g link to one processor
Posted: Wed Jan 04, 2017 10:48 pm

dchapiesky wrote:
So... please check out DPDK, and go ask your team just what applications this hardware is supposedly designed for..... Although I find it hard to believe the second socket doesn't have its own PCIe connections....


Couple of responses:

- The 2nd CPU does have PCIe connections through its root ports; there is just nothing connected to them.
- It is a generic modern server that could be running any application, not tailored for a specific one.
