Request for an explanation of the vector reduction concept

**ferhat.adel** · 26/12/2012, 01h18

Good evening, everyone.

In an example from the book CUDA by Example, I didn't understand this code:

Code :

int i = blockDim.x/2;
while (i != 0) {
    if (cacheIndex < i)
        cache[cacheIndex] += cache[cacheIndex + i];
    __syncthreads();
    i /= 2;
}

I didn't understand why it needs int i = blockDim.x/2;
The book says:

The general idea is that each thread will add two of the values in cache[] and
store the result back to cache[].

Why not use int i = threadIdx.x/2 instead?
Thanks.

**gbdivers** · 26/12/2012, 12h32

Please pay attention to how you format your message (CODE and QUOTE tags, indentation, punctuation).

I assume this is the code from page 79? Isn't it clear with figure 5.4? You do start at half the width of the block.

**ferhat.adel** · 26/12/2012, 13h15

Hello, thank you for your reply. Here is the complete code:

Code :

__global__ void dot( float *a, float *b, float *c ) {
__shared__ float cache[threadsPerBlock];
int tid = threadIdx.x + blockIdx.x * blockDim.x;
int cacheIndex = threadIdx.x;
float temp = 0;
while (tid < N) {
temp += a[tid] * b[tid];
tid += blockDim.x * gridDim.x;
}
// set the cache values
cache[cacheIndex] = temp;
// synchronize threads in this block
__syncthreads();
// for reductions, threadsPerBlock must be a power of 2
// because of the following code
int i = blockDim.x/2;
while (i != 0) {
if (cacheIndex < i)
cache[cacheIndex] += cache[cacheIndex + i];
__syncthreads();
i /= 2;
}
if (cacheIndex == 0)
c[blockIdx.x] = cache[0];
}

I can't understand why they use the statement int i = blockDim.x/2;
even though the reduction is done over the threads of a single block.
Thank you in advance.

**gbdivers** · 26/12/2012, 13h42

Seriously, your improperly indented code is starting to annoy me. You need to pay attention...
Your original code:

Code :

__global__ void dot( float *a, float *b, float *c ) {
__shared__ float cache[threadsPerBlock];
int tid = threadIdx.x + blockIdx.x * blockDim.x;
int cacheIndex = threadIdx.x;
float temp = 0;
while (tid < N) {
temp += a[tid] * b[tid];
tid += blockDim.x * gridDim.x;
}
// set the cache values
cache[cacheIndex] = temp;
// synchronize threads in this block
__syncthreads();
// for reductions, threadsPerBlock must be a power of 2
// because of the following code
int i = blockDim.x/2;
while (i != 0) {
if (cacheIndex < i)
cache[cacheIndex] += cache[cacheIndex + i];
__syncthreads();
i /= 2;
}
if (cacheIndex == 0)
c[blockIdx.x] = cache[0];
}

Your code, correctly indented:

Code :

__global__ void dot( float *a, float *b, float *c ) {
    __shared__ float cache[threadsPerBlock];
    int tid = threadIdx.x + blockIdx.x * blockDim.x;
    int cacheIndex = threadIdx.x;
    float temp = 0;
    while (tid < N) {
        temp += a[tid] * b[tid];
        tid += blockDim.x * gridDim.x;
    }
    // set the cache values
    cache[cacheIndex] = temp;
    // synchronize threads in this block
    __syncthreads();
    // for reductions, threadsPerBlock must be a power of 2
    // because of the following code
    int i = blockDim.x/2;
    while (i != 0) {
        if (cacheIndex < i)
            cache[cacheIndex] += cache[cacheIndex + i];
        __syncthreads();
        i /= 2;
    }
    if (cacheIndex == 0)
        c[blockIdx.x] = cache[0];
}

If you want help, make an effort.

Look at the difference between a "classic" reduction (for example http://cs.anu.edu.au/student/comp332...preduction.jpg) and figure 5.4.
In the classic case, after the first iteration 1 thread out of 2 is working, after the second iteration 1 out of 4 is working, and so on. The code you are asking about groups the partial results together at the low end of cache[], so the threads still working are always threads 0 to i-1.
I suggest taking a pencil and paper and tracing the values the variables take at each iteration.

**ferhat.adel** · 26/12/2012, 13h47

Hello, thank you for your help, and sorry for the badly indented code.
Please, I understand the principle of reduction, but I still don't understand the role of the statement int i = blockDim.x/2, since we are within the same block.
Thanks.
