gcc: PowerPC Hardware Transactional Memory Built-in Functions

6.59.23 PowerPC Hardware Transactional Memory Built-in Functions
----------------------------------------------------------------

GCC provides two interfaces for accessing the Hardware Transactional
Memory (HTM) instructions available on some processors in the PowerPC
family (e.g., POWER8). The first is a low level interface consisting of
built-in functions specific to PowerPC. The second is a higher level
interface consisting of inline functions that are common to PowerPC and
S/390.

6.59.23.1 PowerPC HTM Low Level Built-in Functions
..................................................

The following low level built-in functions are available with '-mhtm' or
'-mcpu=CPU' where CPU is 'power8' or later. Each one generates the
machine instruction that is part of its name.

The HTM builtins (with the exception of '__builtin_tbegin') return the
full 4-bit condition register value set by their associated hardware
instruction. The header file 'htmintrin.h' defines some macros that can
be used to decipher the return value. The '__builtin_tbegin' builtin
returns a simple true or false value depending on whether a transaction
was successfully started or not. The arguments of the builtins match
exactly the type and order of the associated hardware instruction's
operands, except for the '__builtin_tcheck' builtin, which does not take
any input arguments. Refer to the ISA manual for a description of each
instruction's operands.

     unsigned int __builtin_tbegin (unsigned int)
     unsigned int __builtin_tend (unsigned int)

     unsigned int __builtin_tabort (unsigned int)
     unsigned int __builtin_tabortdc (unsigned int, unsigned int, unsigned int)
     unsigned int __builtin_tabortdci (unsigned int, unsigned int, int)
     unsigned int __builtin_tabortwc (unsigned int, unsigned int, unsigned int)
     unsigned int __builtin_tabortwci (unsigned int, unsigned int, int)

     unsigned int __builtin_tcheck (void)
     unsigned int __builtin_treclaim (unsigned int)
     unsigned int __builtin_trechkpt (void)
     unsigned int __builtin_tsr (unsigned int)

In addition to the above HTM built-ins, we have added built-ins for
some common extended mnemonics of the HTM instructions:

     unsigned int __builtin_tendall (void)
     unsigned int __builtin_tresume (void)
     unsigned int __builtin_tsuspend (void)

Note that the semantics of the above HTM builtins are required to mimic
the locking semantics used for critical sections. Builtins that create
a new transaction or restart a suspended transaction must have
lock-acquisition-like semantics, while builtins that end or suspend a
transaction must have lock-release-like semantics. Specifically, they
must mimic lock semantics as specified by C++11. For example, lock
acquisition is as-if an execution of
'__atomic_exchange_n (&globallock, 1, __ATOMIC_ACQUIRE)' that returns 0,
and lock release is as-if an execution of
'__atomic_store_n (&globallock, 0, __ATOMIC_RELEASE)', with 'globallock'
being an implicit implementation-defined lock used for all transactions.

The HTM instructions associated with the builtins inherently provide the
correct acquisition and release hardware barriers required. However,
the compiler must also be prohibited from moving loads and stores across
the builtins in a way that would violate their semantics. This has been
accomplished by adding memory barriers to the associated HTM
instructions (which is a conservative approach to provide acquire and
release semantics). Earlier versions of the compiler did not treat the
HTM instructions as memory barriers. A '__TM_FENCE__' macro has been
added, which can be used to determine whether the current compiler
treats HTM instructions as memory barriers or not. This allows the user
to explicitly add memory barriers to their code when using an older
version of the compiler.
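
As a sketch of how '__TM_FENCE__' might be used, the following
hypothetical 'TM_COMPILER_FENCE' macro expands to nothing when the
compiler already treats the HTM built-ins as memory barriers, and to a
compiler-only barrier otherwise. The macro and the
'tm_fence_is_implicit' helper are illustrative names, not part of GCC.

```c
/* Hypothetical helper macro (not provided by GCC).  When __TM_FENCE__ is
   defined, the HTM built-ins already act as compiler memory barriers, so
   nothing extra is needed.  Otherwise fall back to an empty asm with a
   "memory" clobber, which stops the compiler from moving loads and
   stores across it but emits no instruction.  */
#ifdef __TM_FENCE__
# define TM_COMPILER_FENCE() ((void) 0)
#else
# define TM_COMPILER_FENCE() __asm__ __volatile__ ("" : : : "memory")
#endif

/* Report which variant was selected: 1 if the built-ins are barriers,
   0 if explicit fences are needed.  */
static int
tm_fence_is_implicit (void)
{
#ifdef __TM_FENCE__
  return 1;
#else
  return 0;
#endif
}
```

With an older compiler, 'TM_COMPILER_FENCE ()' would be placed right
after a built-in that starts or resumes a transaction and right before
one that ends or suspends it.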
1
The following built-in functions provide access to the HTM specific
special purpose registers.

     unsigned long __builtin_get_texasr (void)
     unsigned long __builtin_get_texasru (void)
     unsigned long __builtin_get_tfhar (void)
     unsigned long __builtin_get_tfiar (void)

     void __builtin_set_texasr (unsigned long)
     void __builtin_set_texasru (unsigned long)
     void __builtin_set_tfhar (unsigned long)
     void __builtin_set_tfiar (unsigned long)

Example usage of these low level built-in functions may look like:

     #include <htmintrin.h>

     int num_retries = 10;

     while (1)
       {
         if (__builtin_tbegin (0))
           {
             /* Transaction State Initiated. */
             if (is_locked (lock))
               __builtin_tabort (0);
             ... transaction code...
             __builtin_tend (0);
             break;
           }
         else
           {
             /* Transaction State Failed. Use locks if the transaction
                failure is "persistent" or we've tried too many times. */
             if (num_retries-- <= 0
                 || _TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
               {
                 acquire_lock (lock);
                 ... non transactional fallback path...
                 release_lock (lock);
                 break;
               }
           }
       }
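
The example above leaves 'lock', 'is_locked', 'acquire_lock' and
'release_lock' undefined. One possible way to fill them in, assuming a
simple test-and-set spin lock built on the '__atomic' built-ins, is
sketched below. The 'spinlock_t' type, the 'counter' variable, the
'counter_increment' function and the 'HAVE_PPC_HTM' macro are
hypothetical. The sketch also assumes that GCC defines '__HTM__' when
'-mhtm' is in effect, and it uses only the lock path when that macro is
absent so that it still compiles on other targets.

```c
#include <stdbool.h>

/* Assumption: __HTM__ is defined when -mhtm is in effect.  The extra
   architecture check keeps the PowerPC-specific built-ins away from
   other targets.  */
#if defined (__HTM__) && (defined (__powerpc__) || defined (__PPC__))
# include <htmintrin.h>
# define HAVE_PPC_HTM 1
#else
# define HAVE_PPC_HTM 0
#endif

/* Hypothetical test-and-set lock standing in for the example's 'lock'.  */
typedef struct { int held; } spinlock_t;

static bool
is_locked (spinlock_t *l)
{
  return __atomic_load_n (&l->held, __ATOMIC_RELAXED) != 0;
}

static void
acquire_lock (spinlock_t *l)
{
  while (__atomic_exchange_n (&l->held, 1, __ATOMIC_ACQUIRE) != 0)
    ;  /* Spin until the previous holder releases the lock.  */
}

static void
release_lock (spinlock_t *l)
{
  __atomic_store_n (&l->held, 0, __ATOMIC_RELEASE);
}

static spinlock_t counter_lock;
static long counter;

/* Increment 'counter'.  Returns true if the update committed as a
   transaction, false if the lock-based fallback was used.  */
static bool
counter_increment (void)
{
#if HAVE_PPC_HTM
  int num_retries = 10;

  while (1)
    {
      if (__builtin_tbegin (0))
        {
          /* Abort if a fallback-path thread currently holds the lock.  */
          if (is_locked (&counter_lock))
            __builtin_tabort (0);
          counter++;
          __builtin_tend (0);
          return true;
        }
      /* Give up on transactions after too many tries or a persistent
         failure, exactly as in the example above.  */
      if (num_retries-- <= 0
          || _TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
        break;
    }
#endif
  acquire_lock (&counter_lock);
  counter++;
  release_lock (&counter_lock);
  return false;
}
```

Calling 'counter_increment ()' twice leaves 'counter' at 2 on either
path. The return value only reports which path was taken.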
1
One final built-in function has been added that returns the value of
the 2-bit Transaction State field of the Machine Status Register (MSR)
as stored in 'CR0'.

     unsigned long __builtin_ttest (void)

This built-in can be used to determine the current transaction state,
as in the following code example:

     #include <htmintrin.h>

     unsigned char tx_state = _HTM_STATE (__builtin_ttest ());

     if (tx_state == _HTM_TRANSACTIONAL)
       {
         /* Code to use in transactional state. */
       }
     else if (tx_state == _HTM_NONTRANSACTIONAL)
       {
         /* Code to use in non-transactional state. */
       }
     else if (tx_state == _HTM_SUSPENDED)
       {
         /* Code to use in transaction suspended state. */
       }
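
For logging or debugging, the same test could be wrapped in a small
helper that names the state. The sketch below is only an illustration:
'htm_state_name' and 'HAVE_PPC_HTM' are hypothetical names, the
'__HTM__' check is an assumption about how '-mhtm' is detected, and the
helper reports "unavailable" when built without PowerPC HTM support.

```c
/* Assumption: __HTM__ is defined when -mhtm is in effect.  */
#if defined (__HTM__) && (defined (__powerpc__) || defined (__PPC__))
# include <htmintrin.h>
# define HAVE_PPC_HTM 1
#else
# define HAVE_PPC_HTM 0
#endif

/* Hypothetical helper: describe the current transaction state as a
   string.  Without PowerPC HTM support it returns "unavailable".  */
static const char *
htm_state_name (void)
{
#if HAVE_PPC_HTM
  switch (_HTM_STATE (__builtin_ttest ()))
    {
    case _HTM_TRANSACTIONAL:
      return "transactional";
    case _HTM_NONTRANSACTIONAL:
      return "non-transactional";
    case _HTM_SUSPENDED:
      return "suspended";
    default:
      return "unknown";
    }
#else
  return "unavailable";
#endif
}
```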
1
6.59.23.2 PowerPC HTM High Level Inline Functions
.................................................

The following high level HTM interface is made available by including
'<htmxlintrin.h>' and using '-mhtm' or '-mcpu=CPU' where CPU is 'power8'
or later. This interface is common between PowerPC and S/390, allowing
users to write one HTM source implementation that can be compiled and
executed on either system.

     long __TM_simple_begin (void)
     long __TM_begin (void* const TM_buff)
     long __TM_end (void)
     void __TM_abort (void)
     void __TM_named_abort (unsigned char const code)
     void __TM_resume (void)
     void __TM_suspend (void)

     long __TM_is_user_abort (void* const TM_buff)
     long __TM_is_named_user_abort (void* const TM_buff, unsigned char *code)
     long __TM_is_illegal (void* const TM_buff)
     long __TM_is_footprint_exceeded (void* const TM_buff)
     long __TM_nesting_depth (void* const TM_buff)
     long __TM_is_nested_too_deep (void* const TM_buff)
     long __TM_is_conflict (void* const TM_buff)
     long __TM_is_failure_persistent (void* const TM_buff)
     long __TM_failure_address (void* const TM_buff)
     long long __TM_failure_code (void* const TM_buff)
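
The query functions in the second group all inspect the buffer that was
passed to '__TM_begin'. As an illustration, they could be combined into
a single classification step for use after a failed transaction. In the
sketch below the 'tm_failure' enumeration, the 'classify_failure'
function, the 'HAVE_HTM' macro and the order of the checks are
hypothetical choices, and the stand-in 'TM_buff_type' exists only so
that the code compiles when '__HTM__' is not defined.

```c
/* Assumption: __HTM__ is defined when -mhtm is in effect.  */
#if defined (__HTM__)
# include <htmxlintrin.h>
# define HAVE_HTM 1
#else
# define HAVE_HTM 0
/* Stand-in so the sketch compiles without HTM support; the real type
   comes from <htmxlintrin.h>.  */
typedef char TM_buff_type[16];
#endif

/* Hypothetical classification of a failed transaction (not part of
   GCC).  */
enum tm_failure
{
  TM_FAIL_UNKNOWN,
  TM_FAIL_USER_ABORT,
  TM_FAIL_CONFLICT,
  TM_FAIL_FOOTPRINT,
  TM_FAIL_NESTING,
  TM_FAIL_ILLEGAL
};

/* Map the failure information recorded in TM_buff to one category.
   Without HTM support the result is always TM_FAIL_UNKNOWN.  */
static enum tm_failure
classify_failure (void *TM_buff)
{
#if HAVE_HTM
  if (__TM_is_user_abort (TM_buff))
    return TM_FAIL_USER_ABORT;
  if (__TM_is_conflict (TM_buff))
    return TM_FAIL_CONFLICT;
  if (__TM_is_footprint_exceeded (TM_buff))
    return TM_FAIL_FOOTPRINT;
  if (__TM_is_nested_too_deep (TM_buff))
    return TM_FAIL_NESTING;
  if (__TM_is_illegal (TM_buff))
    return TM_FAIL_ILLEGAL;
#else
  (void) TM_buff;
#endif
  return TM_FAIL_UNKNOWN;
}
```

A caller might retry on 'TM_FAIL_CONFLICT' but go straight to a lock
based fallback on 'TM_FAIL_FOOTPRINT', since a transaction that is too
large is unlikely to fit on a second attempt.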
1
Using this common set of HTM inline functions, we can create a more
portable version of the HTM example in the previous section that will
work on either PowerPC or S/390:

     #include <htmxlintrin.h>

     int num_retries = 10;
     TM_buff_type TM_buff;

     while (1)
       {
         if (__TM_begin (TM_buff) == _HTM_TBEGIN_STARTED)
           {
             /* Transaction State Initiated. */
             if (is_locked (lock))
               __TM_abort ();
             ... transaction code...
             __TM_end ();
             break;
           }
         else
           {
             /* Transaction State Failed. Use locks if the transaction
                failure is "persistent" or we've tried too many times. */
             if (num_retries-- <= 0
                 || __TM_is_failure_persistent (TM_buff))
               {
                 acquire_lock (lock);
                 ... non transactional fallback path...
                 release_lock (lock);
                 break;
               }
           }
       }