

# POWER10 Processor for Regular Techies

#### Abstract:

Nigel will cover the POWER10 processor for big commercial server specialists such as Systems Administrators, Performance Gurus, DevOps, Solution Designers, Systems Architects, and similar. It is not aimed at Processor Architects/Designers or kernel coders.

We will work through the publicly available information, followed by a little speculation on how the POWER9 generation servers might look if they were running POWER10 processors: a glimpse into the future.

This session contains no IBM Confidential information and makes no announcements.

We will not be covering dates, prices, performance (rPerf/CPW) ratings, GHz or model names.

Relevant for AIX / IBM i / Linux on POWER environments.

# IBM's POWER10 Processor


# 32<sup>nd</sup> Hot Chips Conference - August 2020



## William Starke

- Distinguished Engineer
- POWER10 Processor Architect



## **Brian Thompto**

- Distinguished Engineer
- POWER10 Processor Core Architect

© 2020 IBM Corporation



# In September + October → YouTube



William Starke

IBM POWER10 Processor (OpenPOWER Summit), 25 minute version with 1100+ views: https://www.youtube.com/watch?v=27VRdI2BGWg





**Brian Thompto** 

IBM POWER10 Processor: chip capabilities, 90 minute version with 180+ views: https://www.youtube.com/watch?v=FMvret3p7qE





# **POWER10 Design Focus**

### Data Plane Bandwidth, Capacity, Composability, Scale

Terabyte/second sockets, Petabyte system memory capacities, 16-socket SMP → Clusters

### **Powerful Enterprise Core**

New Core Architecture, Flexibility, Larger caches, Reduced Latencies

### **End-to-end Security**

Hardware enabled and co-optimized with PowerVM hypervisor

### **Energy Efficiency**

3x improvement over POWER9

### **AI-Infused Core**

10-20x matrix-math performance / socket compared to POWER9

## **POWER10 Processor Chip**

### **Technology and Packaging:**

- 602mm² 7nm Samsung (18B devices)
- 18 layer metal stack, enhanced device
- Single-chip or Dual-chip sockets

### **Computational Capabilities:**

- Up to 15 SMT8 Cores (2 MB L2 Cache / core), up to 120 simultaneous hardware threads
- Up to 120 MB L3 cache (low latency NUCA mgmt)
- 3x energy efficiency relative to POWER9
- Enterprise thread strength optimizations
- AI and security focused ISA additions
- 2x general, 4x matrix SIMD relative to POWER9
- EA-tagged L1 cache, 4x MMU relative to POWER9

### **Open Memory Interface:**

- 16 x8 at up to 32 GT/s (1 TB/s)
- Technology agnostic support: near/main/storage tiers
- Minimal (< 10ns latency) add vs DDR direct attach

### **PowerAXON Interface:**

- 16 x8 at up to 32 GT/s (1 TB/s)
- SMP interconnect for up to 16 sockets
- OpenCAPI attach for memory, accelerators, I/O
- Integrated clustering (memory semantics)

### **PCIe Gen 5 Interface:**

- x64 / DCM at up to 32 GT/s



Die Photo courtesy of Samsung Foundry
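The headline numbers on this slide can be cross-checked with some simple arithmetic. The Python sketch below is only a back-of-envelope check, not an IBM calculation. It assumes one bit per lane per transfer and ignores encoding and protocol overhead, and the helper name `raw_gbytes_per_s` is made up for this illustration. On those assumptions, the 1 TB/s figure for 16 x8 links appears to count both directions together; that is a reading of the numbers, not something the slide states.

```python
# Figures quoted on the chip slide.
CORES_PER_CHIP = 15
THREADS_PER_CORE = 8        # SMT8
L2_MB_PER_CORE = 2
L3_MB_PER_CHIP = 120


def raw_gbytes_per_s(links: int, lanes_per_link: int, gt_per_s: int) -> float:
    """Raw bandwidth in GB/s for one direction.

    Assumption: one bit per lane per transfer, no encoding or protocol overhead.
    """
    return links * lanes_per_link * gt_per_s / 8


threads_per_chip = CORES_PER_CHIP * THREADS_PER_CORE   # hardware threads per chip
l2_mb_per_chip = CORES_PER_CHIP * L2_MB_PER_CORE       # total L2 across all cores
l3_mb_per_core = L3_MB_PER_CHIP / CORES_PER_CHIP       # L3 share per core

omi_one_way = raw_gbytes_per_s(16, 8, 32)              # OMI or PowerAXON, one direction
omi_both_ways = 2 * omi_one_way                        # both directions added together
pcie_one_way = raw_gbytes_per_s(64, 1, 32)             # PCIe Gen 5 x64 per DCM, one direction

print(f"threads per chip : {threads_per_chip}")
print(f"L2 per chip      : {l2_mb_per_chip} MB")
print(f"L3 per core      : {l3_mb_per_core:.0f} MB")
print(f"16 x8 at 32 GT/s : {omi_one_way:.0f} GB/s each way, {omi_both_ways:.0f} GB/s total")
print(f"PCIe x64 at 32   : {pcie_one_way:.0f} GB/s each way")
```

The 120 threads match the slide directly, and 1024 GB/s total is where the rounded "1 TB/s" would come from under these assumptions.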































# More Details

• May be more details than you really want!


## **Powerful Core: Enterprise Flexibility**

## **Multiple World-class Software Stacks**

### Resilience and full stack integrity

- PowerVM, KVM
- AIX, IBM i, Linux on Power, OpenShift



### Partition flexibility and security

- Full-core level LPAR
- Thread-based LPAR scheduling
- **NEW**: With PowerVM Hypervisor
- Hardware assisted container/VM isolation

### **Hardware Based Workload Balance**

Automatic Thread Resource Balancing



# POWER10 Servers

Assumes some familiarity with the POWER9 servers

Public POWER10 presentations make references to POWER10 servers with two, four and sixteen sockets


Reminder - this is not an announcement

From public facts on the POWER10 processor chips

- plus some assumptions
- plus some guesswork

POWER8 and POWER9 had a similar set of servers:

- Well understood and popular with clients
- Let us assume that this continues for POWER10
- This is NOT an IBM statement, just Nigel's guesswork



# **Powerful Architecture: AI Infused and Future Ready**

### **POWER10 implements Power ISA v3.1**

v3.1 was the latest open Power ISA (Instruction Set Architecture) contributed to the OpenPOWER Foundation: royalty free and inclusive of patents for compliant designs.

→ 200+ more instructions



**POWER10 Architecture – Feature Highlights**

| Area | Feature | Details |
|---|---|---|
| Prefix Architecture | Greatly expanded opcode space, PC-relative addressing, MMA masking, etc. | RISC friendly 8B instructions including modified and new opcode forms. |
| New Instructions and Datatypes | New Scalar instructions for control flow, and operation symmetry | Set Boolean extensions; quad-precision extensions; 128b integer extensions; test LSB by byte; byte reverse GPR; int mul/div modulo; string isolate/clear; pause, wait-reserve. |
| | New SIMD instructions for AI, throughput and data manipulation | 32-byte load/store vector-pair; MMA (matrix math assist) with reduced precision; bfloat-16 converts; permute variations: extract, insert, splat, blend; compress/expand assist; mask generation; bit manipulation. |
| Advanced System Features and Ease of Use | Storage management | Persistent memory barrier / flush; store sync; translation extensions. |
| | Debug | PMU sampling, filtering; debug watchpoints; tracing. |
| | Hot/Cold page tracking | Recording for memory management. |
| | Copy/Paste extensions | Memory movement; continued on-chip acceleration: Gzip, 842 compression, AES/SHA. |
| Advanced EnergyScale | Adaptive power management | Additional performance boost across the operating range. |
| Security for Cloud | Transparent isolation and security for enterprise cloud workloads | Nested virtualization with KVM on PowerVM; secure containers; main memory encryption; dynamic execution control; secure PMU. |





# POWER10 Speeds and Feeds Summary

## Performance Gains

- POWER10 with 15 cores = 25% jump
- plus Dual Chip Modules with 30 CPU cores = 100% jump
- plus CPU core thread strength (SMT=8) improvement = 20% jump

= POWER10 servers are going to be

**~3 times faster**

With PowerAXON, OMI, PCIe Gen5 scaling and NVMe disks
- No internal bandwidth limitations to slow down the processor
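The "~3 times" figure follows if the three estimated jumps above are treated as independent multipliers rather than added together. The short Python sketch below shows that arithmetic. The percentages are the speaker's estimates from this slide, not IBM performance ratings, and the dictionary labels are just descriptive names chosen for the illustration.

```python
from math import prod

# Estimated jumps from the slide, expressed as multipliers.
# These are the speaker's guesswork, not rPerf/CPW ratings.
gains = {
    "15 cores per chip": 1.25,             # 25% jump
    "dual chip module, 30 cores": 2.00,    # 100% jump
    "stronger SMT8 threads": 1.20,         # 20% jump
}

# Independent improvements compound, so they multiply.
combined = prod(gains.values())

# For contrast: simply adding the percentages understates the result.
added = 1 + sum(g - 1 for g in gains.values())

print(f"compounded: {combined:.2f}x")   # compounded: 3.00x
print(f"added     : {added:.2f}x")      # added     : 2.45x
```

Multiplying is the natural choice here because each jump applies on top of the previous one: more cores per chip, then twice the chips per socket, then more work per core.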


#### Given:

Same size servers, expected similar electricity use, and higher performance

= Superb Green Credentials (transactions per Watt)

## **POWER10 Client Practical Benefits**

### Stronger (faster) Cores and Threads

- Reduced core counts = reduced software licence costs

### Larger Virtual Machines per server

- Scale-out: two 10 core VMs plus VIOSs → two 28 core VMs plus VIOSs
- Mid-range: two 20 core VMs plus VIOSs → two 58 core VMs plus VIOSs
  - Larger VM = faster apps = fewer performance issues = reduced SysAdmin time

### More Virtual Machines per server

- Scale-out: two 10 core VMs plus VIOSs → seven 8 core VMs plus VIOSs
- Mid-range: two 20 core VMs plus VIOSs → seven 16 core VMs plus VIOSs plus 4 spare cores
  - More VMs per server = consolidation of servers = less SysAdmin time
  - Reduced rack space/floor space, electricity, network connections

Other arguments require prices and rPerf/CPW for comparisons
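The VM sizing examples above are consistent with a simple core budget, sketched below in Python. The assumptions are inferred from the slide's numbers rather than stated on it: a scale-out server is taken to have 2 sockets and a mid-range server 4 sockets, each socket a 30 core dual chip module, with 4 cores set aside for the VIOSs. The function name `pack` is made up for this illustration.

```python
CORES_PER_SOCKET = 30   # dual chip module: 2 x 15 cores
VIOS_CORES = 4          # assumption inferred from the slide's numbers


def pack(sockets: int, vm_cores: int) -> tuple[int, int]:
    """Return (VMs that fit, spare cores) after reserving the VIOS cores."""
    usable = sockets * CORES_PER_SOCKET - VIOS_CORES
    return divmod(usable, vm_cores)


# Larger VMs per server
print(pack(2, 28))   # scale-out, 28 core VMs -> (2, 0)
print(pack(4, 58))   # mid-range, 58 core VMs -> (2, 0)

# More VMs per server
print(pack(2, 8))    # scale-out, 8 core VMs  -> (7, 0)
print(pack(4, 16))   # mid-range, 16 core VMs -> (7, 4)
```

The last case reproduces the "plus 4 spare cores" in the mid-range example: 116 usable cores hold seven 16 core VMs with 4 left over.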


# Questions?

# No questions on

- Dates
- Prices
- Performance
- rPerf or CPW ratings.
- GHz
- Model names
- Future Lotto winning numbers!

#### Special notices

This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in

Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA. All statements regarding IBM future direction and intent are subject to change or withdrawal without notice and represent goals and objectives only.

The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied.

All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions. IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.

All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.




Special notices (continued)
IBM, IBM (logo), AIX, AIX (logo), EnergyScale, IBM i, i for business (logo), Power, POWER, PowerVM, PowerVM (logo), PowerLinux, PowerLinux (logo), Power Architecture, Power ISA, POWER9, and POWER10 are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries.

A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml

Red Hat, OpenShift, and the OpenShift logo are registered trademarks of Red Hat, Inc. in the United States and other countries.

The OpenPOWER word mark and the OpenPOWER logo mark, and related marks, are trademarks and service marks licensed by OpenPOWER

OpenCAPI and the OpenCAPI logo are trademarks of the OpenCAPI Consortium.

Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.

PowerLinux™ uses the registered trademark Linux® pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the Linux® mark on a world-wide basis.

Other company, product and service names may be trademarks or service marks of others.