Mercurial > hg > CbC > CbC_gcc
comparison gcc/matrix-reorg.c @ 55:77e2b8dfacca gcc-4.4.5
update it from 4.4.3 to 4.5.0
author | ryoma <e075725@ie.u-ryukyu.ac.jp> |
---|---|
date | Fri, 12 Feb 2010 23:39:51 +0900 |
parents | a06113de4d67 |
children | b7f97abdc517 |
comparison
equal
deleted
inserted
replaced
52:c156f1bd5cd9 | 55:77e2b8dfacca |
---|---|
1 /* Matrix layout transformations. | 1 /* Matrix layout transformations. |
2 Copyright (C) 2006, 2007, 2008, 2009 Free Software Foundation, Inc. | 2 Copyright (C) 2006, 2007, 2008, 2009 Free Software Foundation, Inc. |
3 Contributed by Razya Ladelsky <razya@il.ibm.com> | 3 Contributed by Razya Ladelsky <razya@il.ibm.com> |
4 Originally written by Revital Eres and Mustafa Hagog. | 4 Originally written by Revital Eres and Mustafa Hagog. |
5 | 5 |
6 This file is part of GCC. | 6 This file is part of GCC. |
7 | 7 |
8 GCC is free software; you can redistribute it and/or modify it under | 8 GCC is free software; you can redistribute it and/or modify it under |
9 the terms of the GNU General Public License as published by the Free | 9 the terms of the GNU General Public License as published by the Free |
10 Software Foundation; either version 3, or (at your option) any later | 10 Software Foundation; either version 3, or (at your option) any later |
18 You should have received a copy of the GNU General Public License | 18 You should have received a copy of the GNU General Public License |
19 along with GCC; see the file COPYING3. If not see | 19 along with GCC; see the file COPYING3. If not see |
20 <http://www.gnu.org/licenses/>. */ | 20 <http://www.gnu.org/licenses/>. */ |
21 | 21 |
22 /* | 22 /* |
23 Matrix flattening optimization tries to replace a N-dimensional | 23 Matrix flattening optimization tries to replace a N-dimensional |
24 matrix with its equivalent M-dimensional matrix, where M < N. | 24 matrix with its equivalent M-dimensional matrix, where M < N. |
25 This first implementation focuses on global matrices defined dynamically. | 25 This first implementation focuses on global matrices defined dynamically. |
26 | 26 |
27 When N==1, we actually flatten the whole matrix. | 27 When N==1, we actually flatten the whole matrix. |
28 For instance consider a two-dimensional array a [dim1] [dim2]. | 28 For instance consider a two-dimensional array a [dim1] [dim2]. |
41 | 41 |
42 The two main phases of the optimization are the analysis | 42 The two main phases of the optimization are the analysis |
43 and transformation. | 43 and transformation. |
44 The driver of the optimization is matrix_reorg (). | 44 The driver of the optimization is matrix_reorg (). |
45 | 45 |
46 | 46 |
47 | 47 |
48 Analysis phase: | 48 Analysis phase: |
49 =============== | 49 =============== |
50 | 50 |
51 We'll number the dimensions outside-in, meaning the most external | 51 We'll number the dimensions outside-in, meaning the most external |
52 is 0, then 1, and so on. | 52 is 0, then 1, and so on. |
53 The analysis part of the optimization determines K, the escape | 53 The analysis part of the optimization determines K, the escape |
54 level of a N-dimensional matrix (K <= N), that allows flattening of | 54 level of a N-dimensional matrix (K <= N), that allows flattening of |
55 the external dimensions 0,1,..., K-1. Escape level 0 means that the | 55 the external dimensions 0,1,..., K-1. Escape level 0 means that the |
56 whole matrix escapes and no flattening is possible. | 56 whole matrix escapes and no flattening is possible. |
57 | 57 |
58 The analysis part is implemented in analyze_matrix_allocation_site() | 58 The analysis part is implemented in analyze_matrix_allocation_site() |
59 and analyze_matrix_accesses(). | 59 and analyze_matrix_accesses(). |
60 | 60 |
61 Transformation phase: | 61 Transformation phase: |
62 ===================== | 62 ===================== |
63 In this phase we define the new flattened matrices that replace the | 63 In this phase we define the new flattened matrices that replace the |
64 original matrices in the code. | 64 original matrices in the code. |
65 Implemented in transform_allocation_sites(), | 65 Implemented in transform_allocation_sites(), |
66 transform_access_sites(). | 66 transform_access_sites(). |
67 | 67 |
68 Matrix Transposing | 68 Matrix Transposing |
69 ================== | 69 ================== |
70 The idea of Matrix Transposing is organizing the matrix in a different | 70 The idea of Matrix Transposing is organizing the matrix in a different |
71 layout such that the dimensions are reordered. | 71 layout such that the dimensions are reordered. |
72 This could produce better cache behavior in some cases. | 72 This could produce better cache behavior in some cases. |
73 | 73 |
74 For example, lets look at the matrix accesses in the following loop: | 74 For example, lets look at the matrix accesses in the following loop: |
75 | 75 |
76 for (i=0; i<N; i++) | 76 for (i=0; i<N; i++) |
77 for (j=0; j<M; j++) | 77 for (j=0; j<M; j++) |
78 access to a[i][j] | 78 access to a[i][j] |
79 | 79 |
80 This loop can produce good cache behavior because the elements of | 80 This loop can produce good cache behavior because the elements of |
81 the inner dimension are accessed sequentially. | 81 the inner dimension are accessed sequentially. |
82 | 82 |
83 However, if the accesses of the matrix were of the following form: | 83 However, if the accesses of the matrix were of the following form: |
84 | 84 |
85 for (i=0; i<N; i++) | 85 for (i=0; i<N; i++) |
86 for (j=0; j<M; j++) | 86 for (j=0; j<M; j++) |
87 access to a[j][i] | 87 access to a[j][i] |
88 | 88 |
89 In this loop we iterate the columns and not the rows. | 89 In this loop we iterate the columns and not the rows. |
90 Therefore, replacing the rows and columns | 90 Therefore, replacing the rows and columns |
91 would have had an organization with better (cache) locality. | 91 would have had an organization with better (cache) locality. |
92 Replacing the dimensions of the matrix is called matrix transposing. | 92 Replacing the dimensions of the matrix is called matrix transposing. |
93 | 93 |
94 This example, of course, could be enhanced to multiple dimensions matrices | 94 This example, of course, could be enhanced to multiple dimensions matrices |
95 as well. | 95 as well. |
96 | 96 |
97 Since a program could include all kind of accesses, there is a decision | 97 Since a program could include all kind of accesses, there is a decision |
98 mechanism, implemented in analyze_transpose(), which implements a | 98 mechanism, implemented in analyze_transpose(), which implements a |
99 heuristic that tries to determine whether to transpose the matrix or not, | 99 heuristic that tries to determine whether to transpose the matrix or not, |
100 according to the form of the more dominant accesses. | 100 according to the form of the more dominant accesses. |
101 This decision is transferred to the flattening mechanism, and whether | 101 This decision is transferred to the flattening mechanism, and whether |
102 the matrix was transposed or not, the matrix is flattened (if possible). | 102 the matrix was transposed or not, the matrix is flattened (if possible). |
103 | 103 |
104 This decision making is based on profiling information and loop information. | 104 This decision making is based on profiling information and loop information. |
105 If profiling information is available, decision making mechanism will be | 105 If profiling information is available, decision making mechanism will be |
106 operated, otherwise the matrix will only be flattened (if possible). | 106 operated, otherwise the matrix will only be flattened (if possible). |
107 | 107 |
108 Both optimizations are described in the paper "Matrix flattening and | 108 Both optimizations are described in the paper "Matrix flattening and |
109 transposing in GCC" which was presented in GCC summit 2006. | 109 transposing in GCC" which was presented in GCC summit 2006. |
110 http://www.gccsummit.org/2006/2006-GCC-Summit-Proceedings.pdf. */ | 110 http://www.gccsummit.org/2006/2006-GCC-Summit-Proceedings.pdf. */ |
111 | 111 |
112 #include "config.h" | 112 #include "config.h" |
113 #include "system.h" | 113 #include "system.h" |
114 #include "coretypes.h" | 114 #include "coretypes.h" |
115 #include "tm.h" | 115 #include "tm.h" |
116 #include "tree.h" | 116 #include "tree.h" |
117 #include "rtl.h" | 117 #include "rtl.h" |
118 #include "c-tree.h" | |
119 #include "tree-inline.h" | 118 #include "tree-inline.h" |
120 #include "tree-flow.h" | 119 #include "tree-flow.h" |
121 #include "tree-flow-inline.h" | 120 #include "tree-flow-inline.h" |
122 #include "langhooks.h" | 121 #include "langhooks.h" |
123 #include "hashtab.h" | 122 #include "hashtab.h" |
129 #include "cgraph.h" | 128 #include "cgraph.h" |
130 #include "diagnostic.h" | 129 #include "diagnostic.h" |
131 #include "timevar.h" | 130 #include "timevar.h" |
132 #include "params.h" | 131 #include "params.h" |
133 #include "fibheap.h" | 132 #include "fibheap.h" |
134 #include "c-common.h" | |
135 #include "intl.h" | 133 #include "intl.h" |
136 #include "function.h" | 134 #include "function.h" |
137 #include "basic-block.h" | 135 #include "basic-block.h" |
138 #include "cfgloop.h" | 136 #include "cfgloop.h" |
139 #include "tree-iterator.h" | 137 #include "tree-iterator.h" |
140 #include "tree-pass.h" | 138 #include "tree-pass.h" |
141 #include "opts.h" | 139 #include "opts.h" |
142 #include "tree-data-ref.h" | 140 #include "tree-data-ref.h" |
143 #include "tree-chrec.h" | 141 #include "tree-chrec.h" |
144 #include "tree-scalar-evolution.h" | 142 #include "tree-scalar-evolution.h" |
143 #include "tree-ssa-sccvn.h" | |
145 | 144 |
146 /* We need to collect a lot of data from the original malloc, | 145 /* We need to collect a lot of data from the original malloc, |
147 particularly as the gimplifier has converted: | 146 particularly as the gimplifier has converted: |
148 | 147 |
149 orig_var = (struct_type *) malloc (x * sizeof (struct_type *)); | 148 orig_var = (struct_type *) malloc (x * sizeof (struct_type *)); |
243 | 242 |
244 typedef struct access_site_info *access_site_info_p; | 243 typedef struct access_site_info *access_site_info_p; |
245 DEF_VEC_P (access_site_info_p); | 244 DEF_VEC_P (access_site_info_p); |
246 DEF_VEC_ALLOC_P (access_site_info_p, heap); | 245 DEF_VEC_ALLOC_P (access_site_info_p, heap); |
247 | 246 |
247 /* Calls to free when flattening a matrix. */ | |
248 | |
249 struct free_info | |
250 { | |
251 gimple stmt; | |
252 tree func; | |
253 }; | |
254 | |
248 /* Information about matrix to flatten. */ | 255 /* Information about matrix to flatten. */ |
249 struct matrix_info | 256 struct matrix_info |
250 { | 257 { |
251 /* Decl tree of this matrix. */ | 258 /* Decl tree of this matrix. */ |
252 tree decl; | 259 tree decl; |
275 /* The location of the allocation sites (they must be in one | 282 /* The location of the allocation sites (they must be in one |
276 function). */ | 283 function). */ |
277 tree allocation_function_decl; | 284 tree allocation_function_decl; |
278 | 285 |
279 /* The calls to free for each level of indirection. */ | 286 /* The calls to free for each level of indirection. */ |
280 struct free_info | 287 struct free_info *free_stmts; |
281 { | |
282 gimple stmt; | |
283 tree func; | |
284 } *free_stmts; | |
285 | 288 |
286 /* An array which holds for each dimension its size. where | 289 /* An array which holds for each dimension its size. where |
287 dimension 0 is the outer most (one that contains all the others). | 290 dimension 0 is the outer most (one that contains all the others). |
288 */ | 291 */ |
289 tree *dimension_size; | 292 tree *dimension_size; |
290 | 293 |
291 /* An array which holds for each dimension it's original size | 294 /* An array which holds for each dimension it's original size |
292 (before transposing and flattening take place). */ | 295 (before transposing and flattening take place). */ |
293 tree *dimension_size_orig; | 296 tree *dimension_size_orig; |
294 | 297 |
295 /* An array which holds for each dimension the size of the type of | 298 /* An array which holds for each dimension the size of the type of |
296 of elements accessed in that level (in bytes). */ | 299 of elements accessed in that level (in bytes). */ |
303 | 306 |
304 /* An array of the accesses to be flattened. | 307 /* An array of the accesses to be flattened. |
305 elements are of type "struct access_site_info *". */ | 308 elements are of type "struct access_site_info *". */ |
306 VEC (access_site_info_p, heap) * access_l; | 309 VEC (access_site_info_p, heap) * access_l; |
307 | 310 |
308 /* A map of how the dimensions will be organized at the end of | 311 /* A map of how the dimensions will be organized at the end of |
309 the analyses. */ | 312 the analyses. */ |
310 int *dim_map; | 313 int *dim_map; |
311 }; | 314 }; |
312 | 315 |
313 /* In each phi node we want to record the indirection level we have when we | 316 /* In each phi node we want to record the indirection level we have when we |
403 return true; | 406 return true; |
404 | 407 |
405 return false; | 408 return false; |
406 } | 409 } |
407 | 410 |
408 /* Return false if STMT may contain a vector expression. | 411 /* Return false if STMT may contain a vector expression. |
409 In this situation, all matrices should not be flattened. */ | 412 In this situation, all matrices should not be flattened. */ |
410 static bool | 413 static bool |
411 may_flatten_matrices_1 (gimple stmt) | 414 may_flatten_matrices_1 (gimple stmt) |
412 { | 415 { |
413 tree t; | 416 tree t; |
447 break; | 450 break; |
448 } | 451 } |
449 return true; | 452 return true; |
450 } | 453 } |
451 | 454 |
452 /* Return false if there are hand-written vectors in the program. | 455 /* Return false if there are hand-written vectors in the program. |
453 We disable the flattening in such a case. */ | 456 We disable the flattening in such a case. */ |
454 static bool | 457 static bool |
455 may_flatten_matrices (struct cgraph_node *node) | 458 may_flatten_matrices (struct cgraph_node *node) |
456 { | 459 { |
457 tree decl; | 460 tree decl; |
682 default: | 685 default: |
683 break; | 686 break; |
684 } | 687 } |
685 } | 688 } |
686 | 689 |
687 /* Record the access/allocation site information for matrix MI so we can | 690 /* Record the access/allocation site information for matrix MI so we can |
688 handle it later in transformation. */ | 691 handle it later in transformation. */ |
689 static void | 692 static void |
690 record_access_alloc_site_info (struct matrix_info *mi, gimple stmt, tree offset, | 693 record_access_alloc_site_info (struct matrix_info *mi, gimple stmt, tree offset, |
691 tree index, int level, bool is_alloc) | 694 tree index, int level, bool is_alloc) |
692 { | 695 { |
742 mark_min_matrix_escape_level (mi, min_malloc_level, stmt); | 745 mark_min_matrix_escape_level (mi, min_malloc_level, stmt); |
743 } | 746 } |
744 else | 747 else |
745 { | 748 { |
746 mark_min_matrix_escape_level (mi, level, stmt); | 749 mark_min_matrix_escape_level (mi, level, stmt); |
747 /* cannot be that (level == min_malloc_level) | 750 /* cannot be that (level == min_malloc_level) |
748 we would have returned earlier. */ | 751 we would have returned earlier. */ |
749 return; | 752 return; |
750 } | 753 } |
751 } | 754 } |
752 | 755 |
780 there is one and only one malloc site that sets this variable. When | 783 there is one and only one malloc site that sets this variable. When |
781 we are performing the flattening we generate a new variable that | 784 we are performing the flattening we generate a new variable that |
782 will hold the size for each dimension; each malloc that allocates a | 785 will hold the size for each dimension; each malloc that allocates a |
783 dimension has the size parameter; we use that parameter to | 786 dimension has the size parameter; we use that parameter to |
784 initialize the dimension size variable so we can use it later in | 787 initialize the dimension size variable so we can use it later in |
785 the address calculations. LEVEL is the dimension we're inspecting. | 788 the address calculations. LEVEL is the dimension we're inspecting. |
786 Return if STMT is related to an allocation site. */ | 789 Return if STMT is related to an allocation site. */ |
787 | 790 |
788 static void | 791 static void |
789 analyze_matrix_allocation_site (struct matrix_info *mi, gimple stmt, | 792 analyze_matrix_allocation_site (struct matrix_info *mi, gimple stmt, |
790 int level, sbitmap visited) | 793 int level, sbitmap visited) |
816 return; | 819 return; |
817 } | 820 } |
818 else | 821 else |
819 { | 822 { |
820 tree malloc_fn_decl; | 823 tree malloc_fn_decl; |
821 const char *malloc_fname; | |
822 | 824 |
823 malloc_fn_decl = gimple_call_fndecl (stmt); | 825 malloc_fn_decl = gimple_call_fndecl (stmt); |
824 if (malloc_fn_decl == NULL_TREE) | 826 if (malloc_fn_decl == NULL_TREE) |
825 { | 827 { |
826 mark_min_matrix_escape_level (mi, level, stmt); | 828 mark_min_matrix_escape_level (mi, level, stmt); |
827 return; | 829 return; |
828 } | 830 } |
829 malloc_fname = IDENTIFIER_POINTER (DECL_NAME (malloc_fn_decl)); | |
830 if (DECL_FUNCTION_CODE (malloc_fn_decl) != BUILT_IN_MALLOC) | 831 if (DECL_FUNCTION_CODE (malloc_fn_decl) != BUILT_IN_MALLOC) |
831 { | 832 { |
832 if (dump_file) | 833 if (dump_file) |
833 fprintf (dump_file, | 834 fprintf (dump_file, |
834 "Matrix %s is an argument to function %s\n", | 835 "Matrix %s is an argument to function %s\n", |
835 get_name (mi->decl), get_name (malloc_fn_decl)); | 836 get_name (mi->decl), get_name (malloc_fn_decl)); |
836 mark_min_matrix_escape_level (mi, level, stmt); | 837 mark_min_matrix_escape_level (mi, level, stmt); |
837 return; | 838 return; |
838 } | 839 } |
839 } | 840 } |
840 /* This is a call to malloc of level 'level'. | 841 /* This is a call to malloc of level 'level'. |
841 mi->max_malloced_level-1 == level means that we've | 842 mi->max_malloced_level-1 == level means that we've |
842 seen a malloc statement of level 'level' before. | 843 seen a malloc statement of level 'level' before. |
843 If the statement is not the same one that we've | 844 If the statement is not the same one that we've |
844 seen before, then there's another malloc statement | 845 seen before, then there's another malloc statement |
845 for the same level, which means that we need to mark | 846 for the same level, which means that we need to mark |
846 it escaping. */ | 847 it escaping. */ |
847 if (mi->malloc_for_level | 848 if (mi->malloc_for_level |
848 && mi->max_malloced_level-1 == level | 849 && mi->max_malloced_level-1 == level |
849 && mi->malloc_for_level[level] != stmt) | 850 && mi->malloc_for_level[level] != stmt) |
850 { | 851 { |
859 statement so be in the safe side and mark it as escaping. */ | 860 statement so be in the safe side and mark it as escaping. */ |
860 mark_min_matrix_escape_level (mi, level, stmt); | 861 mark_min_matrix_escape_level (mi, level, stmt); |
861 } | 862 } |
862 | 863 |
863 /* The transposing decision making. | 864 /* The transposing decision making. |
864 In order to to calculate the profitability of transposing, we collect two | 865 In order to to calculate the profitability of transposing, we collect two |
865 types of information regarding the accesses: | 866 types of information regarding the accesses: |
866 1. profiling information used to express the hotness of an access, that | 867 1. profiling information used to express the hotness of an access, that |
867 is how often the matrix is accessed by this access site (count of the | 868 is how often the matrix is accessed by this access site (count of the |
868 access site). | 869 access site). |
869 2. which dimension in the access site is iterated by the inner | 870 2. which dimension in the access site is iterated by the inner |
870 most loop containing this access. | 871 most loop containing this access. |
871 | 872 |
872 The matrix will have a calculated value of weighted hotness for each | 873 The matrix will have a calculated value of weighted hotness for each |
873 dimension. | 874 dimension. |
874 Intuitively the hotness level of a dimension is a function of how | 875 Intuitively the hotness level of a dimension is a function of how |
875 many times it was the most frequently accessed dimension in the | 876 many times it was the most frequently accessed dimension in the |
876 highly executed access sites of this matrix. | 877 highly executed access sites of this matrix. |
877 | 878 |
878 As computed by following equation: | 879 As computed by following equation: |
879 m n | 880 m n |
880 __ __ | 881 __ __ |
881 \ \ dim_hot_level[i] += | 882 \ \ dim_hot_level[i] += |
882 /_ /_ | 883 /_ /_ |
883 j i | 884 j i |
884 acc[j]->dim[i]->iter_by_inner_loop * count(j) | 885 acc[j]->dim[i]->iter_by_inner_loop * count(j) |
885 | 886 |
886 Where n is the number of dims and m is the number of the matrix | 887 Where n is the number of dims and m is the number of the matrix |
887 access sites. acc[j]->dim[i]->iter_by_inner_loop is 1 if acc[j] | 888 access sites. acc[j]->dim[i]->iter_by_inner_loop is 1 if acc[j] |
888 iterates over dim[i] in innermost loop, and is 0 otherwise. | 889 iterates over dim[i] in innermost loop, and is 0 otherwise. |
952 VEC_free (access_site_info_p, heap, mi->access_l); | 953 VEC_free (access_site_info_p, heap, mi->access_l); |
953 | 954 |
954 return 1; | 955 return 1; |
955 } | 956 } |
956 | 957 |
957 /* Find the index which defines the OFFSET from base. | 958 /* Find the index which defines the OFFSET from base. |
958 We walk from use to def until we find how the offset was defined. */ | 959 We walk from use to def until we find how the offset was defined. */ |
959 static tree | 960 static tree |
960 get_index_from_offset (tree offset, gimple def_stmt) | 961 get_index_from_offset (tree offset, gimple def_stmt) |
961 { | 962 { |
962 tree op1, op2, index; | 963 tree op1, op2, index; |
1038 else if (mi->dimension_type_size[l] != type_size) | 1039 else if (mi->dimension_type_size[l] != type_size) |
1039 mark_min_matrix_escape_level (mi, l, stmt); | 1040 mark_min_matrix_escape_level (mi, l, stmt); |
1040 } | 1041 } |
1041 } | 1042 } |
1042 | 1043 |
1043 /* USE_STMT represents a GIMPLE_CALL, where one of the arguments is the | 1044 /* USE_STMT represents a GIMPLE_CALL, where one of the arguments is the |
1044 ssa var that we want to check because it came from some use of matrix | 1045 ssa var that we want to check because it came from some use of matrix |
1045 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so | 1046 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so |
1046 far. */ | 1047 far. */ |
1047 | 1048 |
1048 static int | 1049 static int |
1049 analyze_accesses_for_call_stmt (struct matrix_info *mi, tree ssa_var, | 1050 analyze_accesses_for_call_stmt (struct matrix_info *mi, tree ssa_var, |
1050 gimple use_stmt, int current_indirect_level) | 1051 gimple use_stmt, int current_indirect_level) |
1117 } | 1118 } |
1118 } | 1119 } |
1119 return current_indirect_level; | 1120 return current_indirect_level; |
1120 } | 1121 } |
1121 | 1122 |
1122 /* USE_STMT represents a phi node of the ssa var that we want to | 1123 /* USE_STMT represents a phi node of the ssa var that we want to |
1123 check because it came from some use of matrix | 1124 check because it came from some use of matrix |
1124 MI. | 1125 MI. |
1125 We check all the escaping levels that get to the PHI node | 1126 We check all the escaping levels that get to the PHI node |
1126 and make sure they are all the same escaping; | 1127 and make sure they are all the same escaping; |
1127 if not (which is rare) we let the escaping level be the | 1128 if not (which is rare) we let the escaping level be the |
1128 minimum level that gets into that PHI because starting from | 1129 minimum level that gets into that PHI because starting from |
1129 that level we cannot expect the behavior of the indirections. | 1130 that level we cannot expect the behavior of the indirections. |
1130 CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ | 1131 CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ |
1131 | 1132 |
1132 static void | 1133 static void |
1133 analyze_accesses_for_phi_node (struct matrix_info *mi, gimple use_stmt, | 1134 analyze_accesses_for_phi_node (struct matrix_info *mi, gimple use_stmt, |
1134 int current_indirect_level, sbitmap visited, | 1135 int current_indirect_level, sbitmap visited, |
1181 record_accesses); | 1182 record_accesses); |
1182 RESET_BIT (visited, SSA_NAME_VERSION (PHI_RESULT (use_stmt))); | 1183 RESET_BIT (visited, SSA_NAME_VERSION (PHI_RESULT (use_stmt))); |
1183 } | 1184 } |
1184 } | 1185 } |
1185 | 1186 |
1186 /* USE_STMT represents an assign statement (the rhs or lhs include | 1187 /* USE_STMT represents an assign statement (the rhs or lhs include |
1187 the ssa var that we want to check because it came from some use of matrix | 1188 the ssa var that we want to check because it came from some use of matrix |
1188 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ | 1189 MI. CURRENT_INDIRECT_LEVEL is the indirection level we reached so far. */ |
1189 | 1190 |
1190 static int | 1191 static int |
1191 analyze_accesses_for_assign_stmt (struct matrix_info *mi, tree ssa_var, | 1192 analyze_accesses_for_assign_stmt (struct matrix_info *mi, tree ssa_var, |
1192 gimple use_stmt, int current_indirect_level, | 1193 gimple use_stmt, int current_indirect_level, |
1241 NULL_TREE, l, true); | 1242 NULL_TREE, l, true); |
1242 update_type_size (mi, use_stmt, NULL, l); | 1243 update_type_size (mi, use_stmt, NULL, l); |
1243 } | 1244 } |
1244 return current_indirect_level; | 1245 return current_indirect_level; |
1245 } | 1246 } |
1246 /* Now, check the right-hand-side, to see how the SSA variable | 1247 /* Now, check the right-hand-side, to see how the SSA variable |
1247 is used. */ | 1248 is used. */ |
1248 if (rhs_acc.var_found) | 1249 if (rhs_acc.var_found) |
1249 { | 1250 { |
1250 if (rhs_acc.t_code != INDIRECT_REF | 1251 if (rhs_acc.t_code != INDIRECT_REF |
1251 && rhs_acc.t_code != POINTER_PLUS_EXPR && rhs_acc.t_code != SSA_NAME) | 1252 && rhs_acc.t_code != POINTER_PLUS_EXPR && rhs_acc.t_code != SSA_NAME) |
1312 { | 1313 { |
1313 int l = current_indirect_level; | 1314 int l = current_indirect_level; |
1314 | 1315 |
1315 /* One exception is when we are storing to the matrix | 1316 /* One exception is when we are storing to the matrix |
1316 variable itself; this is the case of malloc, we must make | 1317 variable itself; this is the case of malloc, we must make |
1317 sure that it's the one and only one call to malloc so | 1318 sure that it's the one and only one call to malloc so |
1318 we call analyze_matrix_allocation_site to check | 1319 we call analyze_matrix_allocation_site to check |
1319 this out. */ | 1320 this out. */ |
1320 if (TREE_CODE (lhs) != VAR_DECL || lhs != mi->decl) | 1321 if (TREE_CODE (lhs) != VAR_DECL || lhs != mi->decl) |
1321 mark_min_matrix_escape_level (mi, current_indirect_level, | 1322 mark_min_matrix_escape_level (mi, current_indirect_level, |
1322 use_stmt); | 1323 use_stmt); |
1323 else | 1324 else |
1339 } | 1340 } |
1340 } | 1341 } |
1341 return current_indirect_level; | 1342 return current_indirect_level; |
1342 } | 1343 } |
1343 | 1344 |
1344 /* Given a SSA_VAR (coming from a use statement of the matrix MI), | 1345 /* Given a SSA_VAR (coming from a use statement of the matrix MI), |
1345 follow its uses and level of indirection and find out the minimum | 1346 follow its uses and level of indirection and find out the minimum |
1346 indirection level it escapes in (the highest dimension) and the maximum | 1347 indirection level it escapes in (the highest dimension) and the maximum |
1347 level it is accessed in (this will be the actual dimension of the | 1348 level it is accessed in (this will be the actual dimension of the |
1348 matrix). The information is accumulated in MI. | 1349 matrix). The information is accumulated in MI. |
1349 We look at the immediate uses, if one escapes we finish; if not, | 1350 We look at the immediate uses, if one escapes we finish; if not, |
1395 current_indirect_level, last_op, | 1396 current_indirect_level, last_op, |
1396 visited, record_accesses); | 1397 visited, record_accesses); |
1397 } | 1398 } |
1398 } | 1399 } |
1399 | 1400 |
1400 typedef struct | 1401 typedef struct |
1401 { | 1402 { |
1402 tree fn; | 1403 tree fn; |
1403 gimple stmt; | 1404 gimple stmt; |
1404 } check_var_data; | 1405 } check_var_data; |
1405 | 1406 |
1452 switch (gimple_code (stmt)) | 1453 switch (gimple_code (stmt)) |
1453 { | 1454 { |
1454 case GIMPLE_ASSIGN: | 1455 case GIMPLE_ASSIGN: |
1455 code = gimple_assign_rhs_code (stmt); | 1456 code = gimple_assign_rhs_code (stmt); |
1456 op1 = gimple_assign_rhs1 (stmt); | 1457 op1 = gimple_assign_rhs1 (stmt); |
1457 | 1458 |
1458 switch (code) | 1459 switch (code) |
1459 { | 1460 { |
1460 case POINTER_PLUS_EXPR: | 1461 case POINTER_PLUS_EXPR: |
1461 case PLUS_EXPR: | 1462 case PLUS_EXPR: |
1462 case MINUS_EXPR: | 1463 case MINUS_EXPR: |
1567 | 1568 |
1568 static int | 1569 static int |
1569 check_allocation_function (void **slot, void *data ATTRIBUTE_UNUSED) | 1570 check_allocation_function (void **slot, void *data ATTRIBUTE_UNUSED) |
1570 { | 1571 { |
1571 int level; | 1572 int level; |
1572 gimple_stmt_iterator gsi; | |
1573 basic_block bb_level_0; | |
1574 struct matrix_info *mi = (struct matrix_info *) *slot; | 1573 struct matrix_info *mi = (struct matrix_info *) *slot; |
1575 sbitmap visited; | 1574 sbitmap visited; |
1576 | 1575 |
1577 if (!mi->malloc_for_level) | 1576 if (!mi->malloc_for_level) |
1578 return 1; | 1577 return 1; |
1589 for (level = 1; level < mi->max_malloced_level; level++) | 1588 for (level = 1; level < mi->max_malloced_level; level++) |
1590 if (!mi->malloc_for_level[level]) | 1589 if (!mi->malloc_for_level[level]) |
1591 break; | 1590 break; |
1592 | 1591 |
1593 mark_min_matrix_escape_level (mi, level, NULL); | 1592 mark_min_matrix_escape_level (mi, level, NULL); |
1594 | |
1595 gsi = gsi_for_stmt (mi->malloc_for_level[0]); | |
1596 bb_level_0 = gsi.bb; | |
1597 | 1593 |
1598 /* Check if the expression of the size passed to malloc could be | 1594 /* Check if the expression of the size passed to malloc could be |
1599 pre-calculated before the malloc of level 0. */ | 1595 pre-calculated before the malloc of level 0. */ |
1600 for (level = 1; level < mi->min_indirect_level_escape; level++) | 1596 for (level = 1; level < mi->min_indirect_level_escape; level++) |
1601 { | 1597 { |
1799 defined by the global variables pointing to the matrices of our interest. | 1795 defined by the global variables pointing to the matrices of our interest. |
1800 in each use of the SSA we calculate the offset from the base address | 1796 in each use of the SSA we calculate the offset from the base address |
1801 according to the following equation: | 1797 according to the following equation: |
1802 | 1798 |
1803 a[I1][I2]...[Ik] , where D1..Dk is the length of each dimension and the | 1799 a[I1][I2]...[Ik] , where D1..Dk is the length of each dimension and the |
1804 escaping level is m <= k, and a' is the new allocated matrix, | 1800 escaping level is m <= k, and a' is the new allocated matrix, |
1805 will be translated to : | 1801 will be translated to : |
1806 | 1802 |
1807 b[I(m+1)]...[Ik] | 1803 b[I(m+1)]...[Ik] |
1808 | 1804 |
1809 where | 1805 where |
1810 b = a' + I1*D2...*Dm + I2*D3...Dm + ... + Im | 1806 b = a' + I1*D2...*Dm + I2*D3...Dm + ... + Im |
1811 */ | 1807 */ |
1812 | 1808 |
1813 static int | 1809 static int |
1814 transform_access_sites (void **slot, void *data ATTRIBUTE_UNUSED) | 1810 transform_access_sites (void **slot, void *data ATTRIBUTE_UNUSED) |
1863 == INDIRECT_REF); | 1859 == INDIRECT_REF); |
1864 /* Emit convert statement to convert to type of use. */ | 1860 /* Emit convert statement to convert to type of use. */ |
1865 tmp = create_tmp_var (TREE_TYPE (lhs), "new"); | 1861 tmp = create_tmp_var (TREE_TYPE (lhs), "new"); |
1866 add_referenced_var (tmp); | 1862 add_referenced_var (tmp); |
1867 rhs = gimple_assign_rhs1 (acc_info->stmt); | 1863 rhs = gimple_assign_rhs1 (acc_info->stmt); |
1868 new_stmt = gimple_build_assign (tmp, | 1864 rhs = fold_convert (TREE_TYPE (tmp), |
1869 TREE_OPERAND (rhs, 0)); | 1865 TREE_OPERAND (rhs, 0)); |
1866 new_stmt = gimple_build_assign (tmp, rhs); | |
1870 tmp = make_ssa_name (tmp, new_stmt); | 1867 tmp = make_ssa_name (tmp, new_stmt); |
1871 gimple_assign_set_lhs (new_stmt, tmp); | 1868 gimple_assign_set_lhs (new_stmt, tmp); |
1872 gsi = gsi_for_stmt (acc_info->stmt); | 1869 gsi = gsi_for_stmt (acc_info->stmt); |
1873 gsi_insert_after (&gsi, new_stmt, GSI_SAME_STMT); | 1870 gsi_insert_after (&gsi, new_stmt, GSI_SAME_STMT); |
1874 SET_USE (use_p, tmp); | 1871 SET_USE (use_p, tmp); |
1912 if (!check_transpose_p || mi->is_transposed_p == false) | 1909 if (!check_transpose_p || mi->is_transposed_p == false) |
1913 tmp1 = offset; | 1910 tmp1 = offset; |
1914 else | 1911 else |
1915 { | 1912 { |
1916 tree new_offset; | 1913 tree new_offset; |
1917 tree d_type_size, d_type_size_k; | |
1918 | |
1919 d_type_size = size_int (mi->dimension_type_size[min_escape_l]); | |
1920 d_type_size_k = size_int (mi->dimension_type_size[k + 1]); | |
1921 | 1914 |
1922 new_offset = | 1915 new_offset = |
1923 compute_offset (mi->dimension_type_size[min_escape_l], | 1916 compute_offset (mi->dimension_type_size[min_escape_l], |
1924 mi->dimension_type_size[k + 1], offset); | 1917 mi->dimension_type_size[k + 1], offset); |
1925 | 1918 |
2011 /* Replace multiple mallocs (one for each dimension) to one malloc | 2004 /* Replace multiple mallocs (one for each dimension) to one malloc |
2012 with the size of DIM1*DIM2*...*DIMN*size_of_element | 2005 with the size of DIM1*DIM2*...*DIMN*size_of_element |
2013 Make sure that we hold the size in the malloc site inside a | 2006 Make sure that we hold the size in the malloc site inside a |
2014 new global variable; this way we ensure that the size doesn't | 2007 new global variable; this way we ensure that the size doesn't |
2015 change and it is accessible from all the other functions that | 2008 change and it is accessible from all the other functions that |
2016 uses the matrix. Also, the original calls to free are deleted, | 2009 uses the matrix. Also, the original calls to free are deleted, |
2017 and replaced by a new call to free the flattened matrix. */ | 2010 and replaced by a new call to free the flattened matrix. */ |
2018 | 2011 |
2019 static int | 2012 static int |
2020 transform_allocation_sites (void **slot, void *data ATTRIBUTE_UNUSED) | 2013 transform_allocation_sites (void **slot, void *data ATTRIBUTE_UNUSED) |
2021 { | 2014 { |
2407 gate_matrix_reorg (void) | 2400 gate_matrix_reorg (void) |
2408 { | 2401 { |
2409 return flag_ipa_matrix_reorg && flag_whole_program; | 2402 return flag_ipa_matrix_reorg && flag_whole_program; |
2410 } | 2403 } |
2411 | 2404 |
2412 struct simple_ipa_opt_pass pass_ipa_matrix_reorg = | 2405 struct simple_ipa_opt_pass pass_ipa_matrix_reorg = |
2413 { | 2406 { |
2414 { | 2407 { |
2415 SIMPLE_IPA_PASS, | 2408 SIMPLE_IPA_PASS, |
2416 "matrix-reorg", /* name */ | 2409 "matrix-reorg", /* name */ |
2417 gate_matrix_reorg, /* gate */ | 2410 gate_matrix_reorg, /* gate */ |
2418 matrix_reorg, /* execute */ | 2411 matrix_reorg, /* execute */ |
2419 NULL, /* sub */ | 2412 NULL, /* sub */ |
2420 NULL, /* next */ | 2413 NULL, /* next */ |
2421 0, /* static_pass_number */ | 2414 0, /* static_pass_number */ |
2422 0, /* tv_id */ | 2415 TV_NONE, /* tv_id */ |
2423 0, /* properties_required */ | 2416 0, /* properties_required */ |
2424 PROP_trees, /* properties_provided */ | 2417 0, /* properties_provided */ |
2425 0, /* properties_destroyed */ | 2418 0, /* properties_destroyed */ |
2426 0, /* todo_flags_start */ | 2419 0, /* todo_flags_start */ |
2427 TODO_dump_cgraph | TODO_dump_func /* todo_flags_finish */ | 2420 TODO_dump_cgraph | TODO_dump_func /* todo_flags_finish */ |
2428 } | 2421 } |
2429 }; | 2422 }; |
2430 |