comparison gcc/matrix-reorg.c @ 55:77e2b8dfacca gcc-4.4.5

update it from 4.4.3 to 4.5.0
author ryoma <e075725@ie.u-ryukyu.ac.jp>
date Fri, 12 Feb 2010 23:39:51 +0900
parents a06113de4d67
children b7f97abdc517
comparison
equal deleted inserted replaced
52:c156f1bd5cd9 55:77e2b8dfacca
1 /* Matrix layout transformations. 1 /* Matrix layout transformations.
2 Copyright (C) 2006, 2007, 2008, 2009 Free Software Foundation, Inc. 2 Copyright (C) 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
3 Contributed by Razya Ladelsky <razya@il.ibm.com> 3 Contributed by Razya Ladelsky <razya@il.ibm.com>
4 Originally written by Revital Eres and Mustafa Hagog. 4 Originally written by Revital Eres and Mustafa Hagog.
5 5
6 This file is part of GCC. 6 This file is part of GCC.
7 7
8 GCC is free software; you can redistribute it and/or modify it under 8 GCC is free software; you can redistribute it and/or modify it under
9 the terms of the GNU General Public License as published by the Free 9 the terms of the GNU General Public License as published by the Free
10 Software Foundation; either version 3, or (at your option) any later 10 Software Foundation; either version 3, or (at your option) any later
18 You should have received a copy of the GNU General Public License 18 You should have received a copy of the GNU General Public License
19 along with GCC; see the file COPYING3. If not see 19 along with GCC; see the file COPYING3. If not see
20 <http://www.gnu.org/licenses/>. */ 20 <http://www.gnu.org/licenses/>. */
21 21
22 /* 22 /*
23 Matrix flattening optimization tries to replace a N-dimensional 23 Matrix flattening optimization tries to replace a N-dimensional
24 matrix with its equivalent M-dimensional matrix, where M < N. 24 matrix with its equivalent M-dimensional matrix, where M < N.
25 This first implementation focuses on global matrices defined dynamically. 25 This first implementation focuses on global matrices defined dynamically.
26 26
27 When N==1, we actually flatten the whole matrix. 27 When N==1, we actually flatten the whole matrix.
28 For instance consider a two-dimensional array a [dim1] [dim2]. 28 For instance consider a two-dimensional array a [dim1] [dim2].
41 41
42 The two main phases of the optimization are the analysis 42 The two main phases of the optimization are the analysis
43 and transformation. 43 and transformation.
44 The driver of the optimization is matrix_reorg (). 44 The driver of the optimization is matrix_reorg ().
45 45
46 46
47 47
48 Analysis phase: 48 Analysis phase:
49 =============== 49 ===============
50 50
51 We'll number the dimensions outside-in, meaning the most external 51 We'll number the dimensions outside-in, meaning the most external
52 is 0, then 1, and so on. 52 is 0, then 1, and so on.
53 The analysis part of the optimization determines K, the escape 53 The analysis part of the optimization determines K, the escape
54 level of a N-dimensional matrix (K <= N), that allows flattening of 54 level of a N-dimensional matrix (K <= N), that allows flattening of
55 the external dimensions 0,1,..., K-1. Escape level 0 means that the 55 the external dimensions 0,1,..., K-1. Escape level 0 means that the
56 whole matrix escapes and no flattening is possible. 56 whole matrix escapes and no flattening is possible.
57 57
58 The analysis part is implemented in analyze_matrix_allocation_site() 58 The analysis part is implemented in analyze_matrix_allocation_site()
59 and analyze_matrix_accesses(). 59 and analyze_matrix_accesses().
60 60
61 Transformation phase: 61 Transformation phase:
62 ===================== 62 =====================
63 In this phase we define the new flattened matrices that replace the 63 In this phase we define the new flattened matrices that replace the
64 original matrices in the code. 64 original matrices in the code.
65 Implemented in transform_allocation_sites(), 65 Implemented in transform_allocation_sites(),
66 transform_access_sites(). 66 transform_access_sites().
67 67
68 Matrix Transposing 68 Matrix Transposing
69 ================== 69 ==================
70 The idea of Matrix Transposing is organizing the matrix in a different 70 The idea of Matrix Transposing is organizing the matrix in a different
71 layout such that the dimensions are reordered. 71 layout such that the dimensions are reordered.
72 This could produce better cache behavior in some cases. 72 This could produce better cache behavior in some cases.
73 73
74 For example, lets look at the matrix accesses in the following loop: 74 For example, lets look at the matrix accesses in the following loop:
75 75
76 for (i=0; i<N; i++) 76 for (i=0; i<N; i++)
77 for (j=0; j<M; j++) 77 for (j=0; j<M; j++)
78 access to a[i][j] 78 access to a[i][j]
79 79
80 This loop can produce good cache behavior because the elements of 80 This loop can produce good cache behavior because the elements of
81 the inner dimension are accessed sequentially. 81 the inner dimension are accessed sequentially.
82 82
83 However, if the accesses of the matrix were of the following form: 83 However, if the accesses of the matrix were of the following form:
84 84
85 for (i=0; i<N; i++) 85 for (i=0; i<N; i++)
86 for (j=0; j<M; j++) 86 for (j=0; j<M; j++)
87 access to a[j][i] 87 access to a[j][i]
88 88
89 In this loop we iterate the columns and not the rows. 89 In this loop we iterate the columns and not the rows.
90 Therefore, replacing the rows and columns 90 Therefore, replacing the rows and columns
91 would have had an organization with better (cache) locality. 91 would have had an organization with better (cache) locality.
92 Replacing the dimensions of the matrix is called matrix transposing. 92 Replacing the dimensions of the matrix is called matrix transposing.
93 93
94 This example, of course, could be enhanced to multiple dimensions matrices 94 This example, of course, could be enhanced to multiple dimensions matrices
95 as well. 95 as well.
96 96
97 Since a program could include all kind of accesses, there is a decision 97 Since a program could include all kind of accesses, there is a decision
98 mechanism, implemented in analyze_transpose(), which implements a 98 mechanism, implemented in analyze_transpose(), which implements a
99 heuristic that tries to determine whether to transpose the matrix or not, 99 heuristic that tries to determine whether to transpose the matrix or not,
100 according to the form of the more dominant accesses. 100 according to the form of the more dominant accesses.
101 This decision is transferred to the flattening mechanism, and whether 101 This decision is transferred to the flattening mechanism, and whether
102 the matrix was transposed or not, the matrix is flattened (if possible). 102 the matrix was transposed or not, the matrix is flattened (if possible).
103 103
104 This decision making is based on profiling information and loop information. 104 This decision making is based on profiling information and loop information.
105 If profiling information is available, decision making mechanism will be 105 If profiling information is available, decision making mechanism will be
106 operated, otherwise the matrix will only be flattened (if possible). 106 operated, otherwise the matrix will only be flattened (if possible).
107 107
108 Both optimizations are described in the paper "Matrix flattening and 108 Both optimizations are described in the paper "Matrix flattening and
109 transposing in GCC" which was presented in GCC summit 2006. 109 transposing in GCC" which was presented in GCC summit 2006.
110 http://www.gccsummit.org/2006/2006-GCC-Summit-Proceedings.pdf. */ 110 http://www.gccsummit.org/2006/2006-GCC-Summit-Proceedings.pdf. */
111 111
112 #include "config.h" 112 #include "config.h"
113 #include "system.h" 113 #include "system.h"
114 #include "coretypes.h" 114 #include "coretypes.h"
115 #include "tm.h" 115 #include "tm.h"
116 #include "tree.h" 116 #include "tree.h"
117 #include "rtl.h" 117 #include "rtl.h"
118 #include "c-tree.h"
119 #include "tree-inline.h" 118 #include "tree-inline.h"
120 #include "tree-flow.h" 119 #include "tree-flow.h"
121 #include "tree-flow-inline.h" 120 #include "tree-flow-inline.h"
122 #include "langhooks.h" 121 #include "langhooks.h"
123 #include "hashtab.h" 122 #include "hashtab.h"
129 #include "cgraph.h" 128 #include "cgraph.h"
130 #include "diagnostic.h" 129 #include "diagnostic.h"
131 #include "timevar.h" 130 #include "timevar.h"
132 #include "params.h" 131 #include "params.h"
133 #include "fibheap.h" 132 #include "fibheap.h"
134 #include "c-common.h"
135 #include "intl.h" 133 #include "intl.h"
136 #include "function.h" 134 #include "function.h"
137 #include "basic-block.h" 135 #include "basic-block.h"
138 #include "cfgloop.h" 136 #include "cfgloop.h"
139 #include "tree-iterator.h" 137 #include "tree-iterator.h"
140 #include "tree-pass.h" 138 #include "tree-pass.h"
141 #include "opts.h" 139 #include "opts.h"
142 #include "tree-data-ref.h" 140 #include "tree-data-ref.h"
143 #include "tree-chrec.h" 141 #include "tree-chrec.h"
144 #include "tree-scalar-evolution.h" 142 #include "tree-scalar-evolution.h"
143 #include "tree-ssa-sccvn.h"
145 144
146 /* We need to collect a lot of data from the original malloc, 145 /* We need to collect a lot of data from the original malloc,
147 particularly as the gimplifier has converted: 146 particularly as the gimplifier has converted:
148 147
149 orig_var = (struct_type *) malloc (x * sizeof (struct_type *)); 148 orig_var = (struct_type *) malloc (x * sizeof (struct_type *));
243 242
244 typedef struct access_site_info *access_site_info_p; 243 typedef struct access_site_info *access_site_info_p;
245 DEF_VEC_P (access_site_info_p); 244 DEF_VEC_P (access_site_info_p);
246 DEF_VEC_ALLOC_P (access_site_info_p, heap); 245 DEF_VEC_ALLOC_P (access_site_info_p, heap);
247 246
247 /* Calls to free when flattening a matrix. */
248
249 struct free_info
250 {
251 gimple stmt;
252 tree func;
253 };
254
248 /* Information about matrix to flatten. */ 255 /* Information about matrix to flatten. */
249 struct matrix_info 256 struct matrix_info
250 { 257 {
251 /* Decl tree of this matrix. */ 258 /* Decl tree of this matrix. */
252 tree decl; 259 tree decl;
275 /* The location of the allocation sites (they must be in one 282 /* The location of the allocation sites (they must be in one
276 function). */ 283 function). */
277 tree allocation_function_decl; 284 tree allocation_function_decl;
278 285
279 /* The calls to free for each level of indirection. */ 286 /* The calls to free for each level of indirection. */
280 struct free_info 287 struct free_info *free_stmts;
281 {
282 gimple stmt;
283 tree func;
284 } *free_stmts;
285 288
286 /* An array which holds for each dimension its size. where 289 /* An array which holds for each dimension its size. where
287 dimension 0 is the outer most (one that contains all the others). 290 dimension 0 is the outer most (one that contains all the others).
288 */ 291 */
289 tree *dimension_size; 292 tree *dimension_size;
290 293
291 /* An array which holds for each dimension it's original size 294 /* An array which holds for each dimension it's original size
292 (before transposing and flattening take place). */ 295 (before transposing and flattening take place). */
293 tree *dimension_size_orig; 296 tree *dimension_size_orig;
294 297
295 /* An array which holds for each dimension the size of the type of 298 /* An array which holds for each dimension the size of the type of
296 of elements accessed in that level (in bytes). */ 299 of elements accessed in that level (in bytes). */
303 306
304 /* An array of the accesses to be flattened. 307 /* An array of the accesses to be flattened.
305 elements are of type "struct access_site_info *". */ 308 elements are of type "struct access_site_info *". */
306 VEC (access_site_info_p, heap) * access_l; 309 VEC (access_site_info_p, heap) * access_l;
307 310
308 /* A map of how the dimensions will be organized at the end of 311 /* A map of how the dimensions will be organized at the end of
309 the analyses. */ 312 the analyses. */
310 int *dim_map; 313 int *dim_map;
311 }; 314 };
312 315
313 /* In each phi node we want to record the indirection level we have when we 316 /* In each phi node we want to record the indirection level we have when we
403 return true; 406 return true;
404 407
405 return false; 408 return false;
406 } 409 }
407 410
408 /* Return false if STMT may contain a vector expression. 411 /* Return false if STMT may contain a vector expression.
409 In this situation, all matrices should not be flattened. */ 412 In this situation, all matrices should not be flattened. */
410 static bool 413 static bool
411 may_flatten_matrices_1 (gimple stmt) 414 may_flatten_matrices_1 (gimple stmt)
412 { 415 {
413 tree t; 416 tree t;
447 break; 450 break;
448 } 451 }
449 return true; 452 return true;
450 } 453 }
451 454
452 /* Return false if there are hand-written vectors in the program. 455 /* Return false if there are hand-written vectors in the program.
453 We disable the flattening in such a case. */ 456 We disable the flattening in such a case. */
454 static bool 457 static bool
455 may_flatten_matrices (struct cgraph_node *node) 458 may_flatten_matrices (struct cgraph_node *node)
456 { 459 {
457 tree decl; 460 tree decl;
682 default: 685 default:
683 break; 686 break;
684 } 687 }
685 } 688 }
686 689
687 /* Record the access/allocation site information for matrix MI so we can 690 /* Record the access/allocation site information for matrix MI so we can
688 handle it later in transformation. */ 691 handle it later in transformation. */
689 static void 692 static void
690 record_access_alloc_site_info (struct matrix_info *mi, gimple stmt, tree offset, 693 record_access_alloc_site_info (struct matrix_info *mi, gimple stmt, tree offset,
691 tree index, int level, bool is_alloc) 694 tree index, int level, bool is_alloc)
692 { 695 {
742 mark_min_matrix_escape_level (mi, min_malloc_level, stmt); 745 mark_min_matrix_escape_level (mi, min_malloc_level, stmt);
743 } 746 }
744 else 747 else
745 { 748 {
746 mark_min_matrix_escape_level (mi, level, stmt); 749 mark_min_matrix_escape_level (mi, level, stmt);
747 /* cannot be that (level == min_malloc_level) 750 /* cannot be that (level == min_malloc_level)
748 we would have returned earlier. */ 751 we would have returned earlier. */
749 return; 752 return;
750 } 753 }
751 } 754 }
752 755
780 there is one and only one malloc site that sets this variable. When 783 there is one and only one malloc site that sets this variable. When
781 we are performing the flattening we generate a new variable that 784 we are performing the flattening we generate a new variable that
782 will hold the size for each dimension; each malloc that allocates a 785 will hold the size for each dimension; each malloc that allocates a
783 dimension has the size parameter; we use that parameter to 786 dimension has the size parameter; we use that parameter to
784 initialize the dimension size variable so we can use it later in 787 initialize the dimension size variable so we can use it later in
785 the address calculations. LEVEL is the dimension we're inspecting. 788 the address calculations. LEVEL is the dimension we're inspecting.
786 Return if STMT is related to an allocation site. */ 789 Return if STMT is related to an allocation site. */
787 790
788 static void 791 static void
789 analyze_matrix_allocation_site (struct matrix_info *mi, gimple stmt, 792 analyze_matrix_allocation_site (struct matrix_info *mi, gimple stmt,
790 int level, sbitmap visited) 793 int level, sbitmap visited)
816 return; 819 return;
817 } 820 }
818 else 821 else
819 { 822 {
820 tree malloc_fn_decl; 823 tree malloc_fn_decl;
821 const char *malloc_fname;
822 824
823 malloc_fn_decl = gimple_call_fndecl (stmt); 825 malloc_fn_decl = gimple_call_fndecl (stmt);
824 if (malloc_fn_decl == NULL_TREE) 826 if (malloc_fn_decl == NULL_TREE)
825 { 827 {
826 mark_min_matrix_escape_level (mi, level, stmt); 828 mark_min_matrix_escape_level (mi, level, stmt);
827 return; 829 return;
828 } 830 }
829 malloc_fname = IDENTIFIER_POINTER (DECL_NAME (malloc_fn_decl));
830 if (DECL_FUNCTION_CODE (malloc_fn_decl) != BUILT_IN_MALLOC) 831 if (DECL_FUNCTION_CODE (malloc_fn_decl) != BUILT_IN_MALLOC)
831 { 832 {
832 if (dump_file) 833 if (dump_file)
833 fprintf (dump_file, 834 fprintf (dump_file,
834 "Matrix %s is an argument to function %s\n", 835 "Matrix %s is an argument to function %s\n",
835 get_name (mi->decl), get_name (malloc_fn_decl)); 836 get_name (mi->decl), get_name (malloc_fn_decl));
836 mark_min_matrix_escape_level (mi, level, stmt); 837 mark_min_matrix_escape_level (mi, level, stmt);
837 return; 838 return;
838 } 839 }
839 } 840 }
840 /* This is a call to malloc of level 'level'. 841 /* This is a call to malloc of level 'level'.
841 mi->max_malloced_level-1 == level means that we've 842 mi->max_malloced_level-1 == level means that we've
842 seen a malloc statement of level 'level' before. 843 seen a malloc statement of level 'level' before.
843 If the statement is not the same one that we've 844 If the statement is not the same one that we've
844 seen before, then there's another malloc statement 845 seen before, then there's another malloc statement
845 for the same level, which means that we need to mark 846 for the same level, which means that we need to mark
846 it escaping. */ 847 it escaping. */
847 if (mi->malloc_for_level 848 if (mi->malloc_for_level
848 && mi->max_malloced_level-1 == level 849 && mi->max_malloced_level-1 == level
849 && mi->malloc_for_level[level] != stmt) 850 && mi->malloc_for_level[level] != stmt)
850 { 851 {
859 statement so be in the safe side and mark it as escaping. */ 860 statement so be in the safe side and mark it as escaping. */
860 mark_min_matrix_escape_level (mi, level, stmt); 861 mark_min_matrix_escape_level (mi, level, stmt);
861 } 862 }
862 863
863 /* The transposing decision making. 864 /* The transposing decision making.
864 In order to to calculate the profitability of transposing, we collect two 865 In order to to calculate the profitability of transposing, we collect two
865 types of information regarding the accesses: 866 types of information regarding the accesses:
866 1. profiling information used to express the hotness of an access, that 867 1. profiling information used to express the hotness of an access, that
867 is how often the matrix is accessed by this access site (count of the 868 is how often the matrix is accessed by this access site (count of the
868 access site). 869 access site).
869 2. which dimension in the access site is iterated by the inner 870 2. which dimension in the access site is iterated by the inner
870 most loop containing this access. 871 most loop containing this access.
871 872
872 The matrix will have a calculated value of weighted hotness for each 873 The matrix will have a calculated value of weighted hotness for each
873 dimension. 874 dimension.
874 Intuitively the hotness level of a dimension is a function of how 875 Intuitively the hotness level of a dimension is a function of how
875 many times it was the most frequently accessed dimension in the 876 many times it was the most frequently accessed dimension in the
876 highly executed access sites of this matrix. 877 highly executed access sites of this matrix.
877 878
878 As computed by following equation: 879 As computed by following equation:
879 m n 880 m n
880 __ __ 881 __ __
881 \ \ dim_hot_level[i] += 882 \ \ dim_hot_level[i] +=
882 /_ /_ 883 /_ /_
883 j i 884 j i
884 acc[j]->dim[i]->iter_by_inner_loop * count(j) 885 acc[j]->dim[i]->iter_by_inner_loop * count(j)
885 886
886 Where n is the number of dims and m is the number of the matrix 887 Where n is the number of dims and m is the number of the matrix
887 access sites. acc[j]->dim[i]->iter_by_inner_loop is 1 if acc[j] 888 access sites. acc[j]->dim[i]->iter_by_inner_loop is 1 if acc[j]
888 iterates over dim[i] in innermost loop, and is 0 otherwise. 889 iterates over dim[i] in innermost loop, and is 0 otherwise.
952 VEC_free (access_site_info_p, heap, mi->access_l); 953 VEC_free (access_site_info_p, heap, mi->access_l);
953 954
954 return 1; 955 return 1;
955 } 956 }
956 957
957 /* Find the index which defines the OFFSET from base. 958 /* Find the index which defines the OFFSET from base.
958 We walk from use to def until we find how the offset was defined. */ 959 We walk from use to def until we find how the offset was defined. */
959 static tree 960 static tree
960 get_index_from_offset (tree offset, gimple def_stmt) 961 get_index_from_offset (tree offset, gimple def_stmt)
961 { 962 {
962 tree op1, op2, index; 963 tree op1, op2, index;
1038 else if (mi->dimension_type_size[l] != type_size) 1039 else if (mi->dimension_type_size[l] != type_size)
1039 mark_min_matrix_escape_level (mi, l, stmt); 1040 mark_min_matrix_escape_level (mi, l, stmt);
1040 } 1041 }
1041 } 1042 }
1042 1043
1043 /* USE_STMT represents a GIMPLE_CALL, where one of the arguments is the 1044 /* USE_STMT represents a GIMPLE_CALL, where one of the arguments is the
1044 ssa var that we want to check because it came from some use of matrix 1045 ssa var that we want to check because it came from some use of matrix
1045 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so 1046 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so
1046 far. */ 1047 far. */
1047 1048
1048 static int 1049 static int
1049 analyze_accesses_for_call_stmt (struct matrix_info *mi, tree ssa_var, 1050 analyze_accesses_for_call_stmt (struct matrix_info *mi, tree ssa_var,
1050 gimple use_stmt, int current_indirect_level) 1051 gimple use_stmt, int current_indirect_level)
1117 } 1118 }
1118 } 1119 }
1119 return current_indirect_level; 1120 return current_indirect_level;
1120 } 1121 }
1121 1122
1122 /* USE_STMT represents a phi node of the ssa var that we want to 1123 /* USE_STMT represents a phi node of the ssa var that we want to
1123 check because it came from some use of matrix 1124 check because it came from some use of matrix
1124 MI. 1125 MI.
1125 We check all the escaping levels that get to the PHI node 1126 We check all the escaping levels that get to the PHI node
1126 and make sure they are all the same escaping; 1127 and make sure they are all the same escaping;
1127 if not (which is rare) we let the escaping level be the 1128 if not (which is rare) we let the escaping level be the
1128 minimum level that gets into that PHI because starting from 1129 minimum level that gets into that PHI because starting from
1129 that level we cannot expect the behavior of the indirections. 1130 that level we cannot expect the behavior of the indirections.
1130 CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ 1131 CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */
1131 1132
1132 static void 1133 static void
1133 analyze_accesses_for_phi_node (struct matrix_info *mi, gimple use_stmt, 1134 analyze_accesses_for_phi_node (struct matrix_info *mi, gimple use_stmt,
1134 int current_indirect_level, sbitmap visited, 1135 int current_indirect_level, sbitmap visited,
1181 record_accesses); 1182 record_accesses);
1182 RESET_BIT (visited, SSA_NAME_VERSION (PHI_RESULT (use_stmt))); 1183 RESET_BIT (visited, SSA_NAME_VERSION (PHI_RESULT (use_stmt)));
1183 } 1184 }
1184 } 1185 }
1185 1186
1186 /* USE_STMT represents an assign statement (the rhs or lhs include 1187 /* USE_STMT represents an assign statement (the rhs or lhs include
1187 the ssa var that we want to check because it came from some use of matrix 1188 the ssa var that we want to check because it came from some use of matrix
1188 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ 1189 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */
1189 1190
1190 static int 1191 static int
1191 analyze_accesses_for_assign_stmt (struct matrix_info *mi, tree ssa_var, 1192 analyze_accesses_for_assign_stmt (struct matrix_info *mi, tree ssa_var,
1192 gimple use_stmt, int current_indirect_level, 1193 gimple use_stmt, int current_indirect_level,
1241 NULL_TREE, l, true); 1242 NULL_TREE, l, true);
1242 update_type_size (mi, use_stmt, NULL, l); 1243 update_type_size (mi, use_stmt, NULL, l);
1243 } 1244 }
1244 return current_indirect_level; 1245 return current_indirect_level;
1245 } 1246 }
1246 /* Now, check the right-hand-side, to see how the SSA variable 1247 /* Now, check the right-hand-side, to see how the SSA variable
1247 is used. */ 1248 is used. */
1248 if (rhs_acc.var_found) 1249 if (rhs_acc.var_found)
1249 { 1250 {
1250 if (rhs_acc.t_code != INDIRECT_REF 1251 if (rhs_acc.t_code != INDIRECT_REF
1251 && rhs_acc.t_code != POINTER_PLUS_EXPR && rhs_acc.t_code != SSA_NAME) 1252 && rhs_acc.t_code != POINTER_PLUS_EXPR && rhs_acc.t_code != SSA_NAME)
1312 { 1313 {
1313 int l = current_indirect_level; 1314 int l = current_indirect_level;
1314 1315
1315 /* One exception is when we are storing to the matrix 1316 /* One exception is when we are storing to the matrix
1316 variable itself; this is the case of malloc, we must make 1317 variable itself; this is the case of malloc, we must make
1317 sure that it's the one and only one call to malloc so 1318 sure that it's the one and only one call to malloc so
1318 we call analyze_matrix_allocation_site to check 1319 we call analyze_matrix_allocation_site to check
1319 this out. */ 1320 this out. */
1320 if (TREE_CODE (lhs) != VAR_DECL || lhs != mi->decl) 1321 if (TREE_CODE (lhs) != VAR_DECL || lhs != mi->decl)
1321 mark_min_matrix_escape_level (mi, current_indirect_level, 1322 mark_min_matrix_escape_level (mi, current_indirect_level,
1322 use_stmt); 1323 use_stmt);
1323 else 1324 else
1339 } 1340 }
1340 } 1341 }
1341 return current_indirect_level; 1342 return current_indirect_level;
1342 } 1343 }
1343 1344
1344 /* Given a SSA_VAR (coming from a use statement of the matrix MI), 1345 /* Given a SSA_VAR (coming from a use statement of the matrix MI),
1345 follow its uses and level of indirection and find out the minimum 1346 follow its uses and level of indirection and find out the minimum
1346 indirection level it escapes in (the highest dimension) and the maximum 1347 indirection level it escapes in (the highest dimension) and the maximum
1347 level it is accessed in (this will be the actual dimension of the 1348 level it is accessed in (this will be the actual dimension of the
1348 matrix). The information is accumulated in MI. 1349 matrix). The information is accumulated in MI.
1349 We look at the immediate uses, if one escapes we finish; if not, 1350 We look at the immediate uses, if one escapes we finish; if not,
1395 current_indirect_level, last_op, 1396 current_indirect_level, last_op,
1396 visited, record_accesses); 1397 visited, record_accesses);
1397 } 1398 }
1398 } 1399 }
1399 1400
1400 typedef struct 1401 typedef struct
1401 { 1402 {
1402 tree fn; 1403 tree fn;
1403 gimple stmt; 1404 gimple stmt;
1404 } check_var_data; 1405 } check_var_data;
1405 1406
1452 switch (gimple_code (stmt)) 1453 switch (gimple_code (stmt))
1453 { 1454 {
1454 case GIMPLE_ASSIGN: 1455 case GIMPLE_ASSIGN:
1455 code = gimple_assign_rhs_code (stmt); 1456 code = gimple_assign_rhs_code (stmt);
1456 op1 = gimple_assign_rhs1 (stmt); 1457 op1 = gimple_assign_rhs1 (stmt);
1457 1458
1458 switch (code) 1459 switch (code)
1459 { 1460 {
1460 case POINTER_PLUS_EXPR: 1461 case POINTER_PLUS_EXPR:
1461 case PLUS_EXPR: 1462 case PLUS_EXPR:
1462 case MINUS_EXPR: 1463 case MINUS_EXPR:
1567 1568
1568 static int 1569 static int
1569 check_allocation_function (void **slot, void *data ATTRIBUTE_UNUSED) 1570 check_allocation_function (void **slot, void *data ATTRIBUTE_UNUSED)
1570 { 1571 {
1571 int level; 1572 int level;
1572 gimple_stmt_iterator gsi;
1573 basic_block bb_level_0;
1574 struct matrix_info *mi = (struct matrix_info *) *slot; 1573 struct matrix_info *mi = (struct matrix_info *) *slot;
1575 sbitmap visited; 1574 sbitmap visited;
1576 1575
1577 if (!mi->malloc_for_level) 1576 if (!mi->malloc_for_level)
1578 return 1; 1577 return 1;
1589 for (level = 1; level < mi->max_malloced_level; level++) 1588 for (level = 1; level < mi->max_malloced_level; level++)
1590 if (!mi->malloc_for_level[level]) 1589 if (!mi->malloc_for_level[level])
1591 break; 1590 break;
1592 1591
1593 mark_min_matrix_escape_level (mi, level, NULL); 1592 mark_min_matrix_escape_level (mi, level, NULL);
1594
1595 gsi = gsi_for_stmt (mi->malloc_for_level[0]);
1596 bb_level_0 = gsi.bb;
1597 1593
1598 /* Check if the expression of the size passed to malloc could be 1594 /* Check if the expression of the size passed to malloc could be
1599 pre-calculated before the malloc of level 0. */ 1595 pre-calculated before the malloc of level 0. */
1600 for (level = 1; level < mi->min_indirect_level_escape; level++) 1596 for (level = 1; level < mi->min_indirect_level_escape; level++)
1601 { 1597 {
1799 defined by the global variables pointing to the matrices of our interest. 1795 defined by the global variables pointing to the matrices of our interest.
1800 in each use of the SSA we calculate the offset from the base address 1796 in each use of the SSA we calculate the offset from the base address
1801 according to the following equation: 1797 according to the following equation:
1802 1798
1803 a[I1][I2]...[Ik] , where D1..Dk is the length of each dimension and the 1799 a[I1][I2]...[Ik] , where D1..Dk is the length of each dimension and the
1804 escaping level is m <= k, and a' is the new allocated matrix, 1800 escaping level is m <= k, and a' is the new allocated matrix,
1805 will be translated to : 1801 will be translated to :
1806 1802
1807 b[I(m+1)]...[Ik] 1803 b[I(m+1)]...[Ik]
1808 1804
1809 where 1805 where
1810 b = a' + I1*D2...*Dm + I2*D3...Dm + ... + Im 1806 b = a' + I1*D2...*Dm + I2*D3...Dm + ... + Im
1811 */ 1807 */
1812 1808
1813 static int 1809 static int
1814 transform_access_sites (void **slot, void *data ATTRIBUTE_UNUSED) 1810 transform_access_sites (void **slot, void *data ATTRIBUTE_UNUSED)
1863 == INDIRECT_REF); 1859 == INDIRECT_REF);
1864 /* Emit convert statement to convert to type of use. */ 1860 /* Emit convert statement to convert to type of use. */
1865 tmp = create_tmp_var (TREE_TYPE (lhs), "new"); 1861 tmp = create_tmp_var (TREE_TYPE (lhs), "new");
1866 add_referenced_var (tmp); 1862 add_referenced_var (tmp);
1867 rhs = gimple_assign_rhs1 (acc_info->stmt); 1863 rhs = gimple_assign_rhs1 (acc_info->stmt);
1868 new_stmt = gimple_build_assign (tmp, 1864 rhs = fold_convert (TREE_TYPE (tmp),
1869 TREE_OPERAND (rhs, 0)); 1865 TREE_OPERAND (rhs, 0));
1866 new_stmt = gimple_build_assign (tmp, rhs);
1870 tmp = make_ssa_name (tmp, new_stmt); 1867 tmp = make_ssa_name (tmp, new_stmt);
1871 gimple_assign_set_lhs (new_stmt, tmp); 1868 gimple_assign_set_lhs (new_stmt, tmp);
1872 gsi = gsi_for_stmt (acc_info->stmt); 1869 gsi = gsi_for_stmt (acc_info->stmt);
1873 gsi_insert_after (&gsi, new_stmt, GSI_SAME_STMT); 1870 gsi_insert_after (&gsi, new_stmt, GSI_SAME_STMT);
1874 SET_USE (use_p, tmp); 1871 SET_USE (use_p, tmp);
1912 if (!check_transpose_p || mi->is_transposed_p == false) 1909 if (!check_transpose_p || mi->is_transposed_p == false)
1913 tmp1 = offset; 1910 tmp1 = offset;
1914 else 1911 else
1915 { 1912 {
1916 tree new_offset; 1913 tree new_offset;
1917 tree d_type_size, d_type_size_k;
1918
1919 d_type_size = size_int (mi->dimension_type_size[min_escape_l]);
1920 d_type_size_k = size_int (mi->dimension_type_size[k + 1]);
1921 1914
1922 new_offset = 1915 new_offset =
1923 compute_offset (mi->dimension_type_size[min_escape_l], 1916 compute_offset (mi->dimension_type_size[min_escape_l],
1924 mi->dimension_type_size[k + 1], offset); 1917 mi->dimension_type_size[k + 1], offset);
1925 1918
2011 /* Replace multiple mallocs (one for each dimension) to one malloc 2004 /* Replace multiple mallocs (one for each dimension) to one malloc
2012 with the size of DIM1*DIM2*...*DIMN*size_of_element 2005 with the size of DIM1*DIM2*...*DIMN*size_of_element
2013 Make sure that we hold the size in the malloc site inside a 2006 Make sure that we hold the size in the malloc site inside a
2014 new global variable; this way we ensure that the size doesn't 2007 new global variable; this way we ensure that the size doesn't
2015 change and it is accessible from all the other functions that 2008 change and it is accessible from all the other functions that
2016 uses the matrix. Also, the original calls to free are deleted, 2009 uses the matrix. Also, the original calls to free are deleted,
2017 and replaced by a new call to free the flattened matrix. */ 2010 and replaced by a new call to free the flattened matrix. */
2018 2011
2019 static int 2012 static int
2020 transform_allocation_sites (void **slot, void *data ATTRIBUTE_UNUSED) 2013 transform_allocation_sites (void **slot, void *data ATTRIBUTE_UNUSED)
2021 { 2014 {
2407 gate_matrix_reorg (void) 2400 gate_matrix_reorg (void)
2408 { 2401 {
2409 return flag_ipa_matrix_reorg && flag_whole_program; 2402 return flag_ipa_matrix_reorg && flag_whole_program;
2410 } 2403 }
2411 2404
2412 struct simple_ipa_opt_pass pass_ipa_matrix_reorg = 2405 struct simple_ipa_opt_pass pass_ipa_matrix_reorg =
2413 { 2406 {
2414 { 2407 {
2415 SIMPLE_IPA_PASS, 2408 SIMPLE_IPA_PASS,
2416 "matrix-reorg", /* name */ 2409 "matrix-reorg", /* name */
2417 gate_matrix_reorg, /* gate */ 2410 gate_matrix_reorg, /* gate */
2418 matrix_reorg, /* execute */ 2411 matrix_reorg, /* execute */
2419 NULL, /* sub */ 2412 NULL, /* sub */
2420 NULL, /* next */ 2413 NULL, /* next */
2421 0, /* static_pass_number */ 2414 0, /* static_pass_number */
2422 0, /* tv_id */ 2415 TV_NONE, /* tv_id */
2423 0, /* properties_required */ 2416 0, /* properties_required */
2424 PROP_trees, /* properties_provided */ 2417 0, /* properties_provided */
2425 0, /* properties_destroyed */ 2418 0, /* properties_destroyed */
2426 0, /* todo_flags_start */ 2419 0, /* todo_flags_start */
2427 TODO_dump_cgraph | TODO_dump_func /* todo_flags_finish */ 2420 TODO_dump_cgraph | TODO_dump_func /* todo_flags_finish */
2428 } 2421 }
2429 }; 2422 };
2430