1.This is the general pattern for making programs
faster:
- Understand what makes the program slow. Where does
the program spend most of its time? Chances are good that
a few places account for a lot of time.
- Design and implement a solution that will give you
the most speedup for the least effort.
- Iterate steps 1 and 2 until performance is
satisfactory or until the remaining optimizations are not
worth the effort.
2.What to Measure
The most common thing to measure is CPU time.
An alternative is to measure real time or
"wall clock time" (sometimes just "wall time")-which is the
time an ordinary clock on the wall or a wrist watch shows.
The difference between CPU time and wall time can give some
indication of the time spent waiting for I/O.
CPU time can be divided between user time, the
time spent directly executing your program code, and
system time, the time spent by the operating system
on behalf of your program. All of these measurements help
you understand what is going on in your program when it
executes.
3.Method to timing
1. clock_t start = clock();
subrutine();
clock_t stop = clock();
double totaltime = (double)(stop -start)
/CLOCKS_PER_SEC;
2. statical sampling
4.Principles The 80/20 Rule
In the profiling example, we see that most of the CPU
time is spent in a fraction of the total program. This is
sometimes called the 80/20 rule-that is, 80% of the CPU
time is spent in 20% of the program. These places where the
computer spends most of its time are also called hot
spots, inner loops, and kernels. This is
the code you want to optimize.
5.Amdahl's Law
Suppose the program really spends 80% of its time in one
spot, and suppose you can rewrite this spot to take a
negligible amount of time. The program will now execute in
20% of its original time, meaning that it now runs 5 times
as fast. So an infinite improvement in the hot spot only
gives a 5-fold improvement overall. Getting additional
speedup will then require many optimizations throughout the
program, each improvement yielding only a small change in
overall run time, and it is probable that these
improvements won't be worth the effort.
Knowing how to optimize is an important skill. Knowing
when to stop optimizing is also important. You can use
profiling results to predict the best possible improvement
you can get from any particular optimization. For example,
if your
program spends 20% of its time in one loop,
optimizing that loop can speed up the program at most by
1/(1-0.2) = 1.25-that is, by 25%.
6.What have we learned from this exercise? First, most
optimizations are the result of faster algorithms. Usually,
most of the speedup will come from one or a few
optimizations. (I could have implemented a hash table to
speed up string_to_heap, but the improvement
would have been small. countpairs and
pick_nth are where all the action is.) Always
question numbers to see if they make sense and to make sure
you understand them. (By questioning the time spent by the
output routine, I learned that console text output and
scrolling was ignored by the profiler.) Know when to stop.
(The original program, when compiled with optimization,
takes less than one second to output 200 words.)
[words.cpp]
#include "stdio.h"
#include "stdlib.h"
#include "ctype.h"
#include "string.h"
#include "time.h"
FILE *inp;
int wordcount = 0;
// we don't know how many words there will be, so
// for simplicity, just allocate a big array:
char *words[100000];
// this array points to strings that are allocated
// on the heap
void fatal(char *msg)
{
printf("FATAL ERROR: %s\n", msg);
exit(1);
}
void opentext(char *filename)
{
inp = fopen(filename, "r");
if (!inp) fatal("file open failed");
}
// getword -- return true if you got a word
//
int getword(char *word)
{
int index = 0;
int c = getc(inp);
if (c <= 0) return false;
// scan to alpha
while (!isalpha(c)) {
c = getc(inp);
if (c <= 0) return false;
}
// scan to non-alpha and save in word
while (isalpha(c) && (c >= 0)) {
word[index++] = c;
c = getc(inp);
}
word[index] = 0; /* eos */
/* be careful if c is eof indication */
if (c >= 0) ungetc(c, inp);
return true;
}
// string_to_heap -- move a string to the heap
//
char *string_to_heap(char *s)
{
char *h = (char *) malloc(strlen(s) + 1);
if (!h) fatal("no more memory for words");
strcpy(h, s);
return h;
}
// readwords -- read words to array
//
void readwords()
{
// each word goes here first:
char word[1024];
while (getword(word)) {
// then we figure out how much memory was
// really needed, allocate that on the heap,
// copy word to the heap memory, and save
// a pointer in words array:
words[wordcount++] = string_to_heap(word);
}
}
#define streql(s1, s2) (strcmp(s1, s2) == 0)
// countpairs -- how many pairs (word1, word2)
//
int countpairs(char *word1, char *word2)
{
int count = 0;
int i;
for (i = 0; i < wordcount; i++) {
// find word1:
if (streql(words[i], word1)) {
// if found, check if it's followed by word2:
if (streql(words[(i + 1) % wordcount], word2)) {
// yes, we found (word1, word2)
count++;
}
}
}
return count;
}
// pick_nth -- find word after the nth pair
//
// this works like countpairs(), but it stops
// after finding n words and returns the next
// word
//
char *pick_nth(char *word1, char *word2, int n)
{
int count = 0;
int i;
for (i = 0; i < wordcount; i++) {
if (streql(words[i], word1)) {
if (streql(words[(i + 1) % wordcount], word2)) {
count++;
if (count == n)
return words[(i + 2) % wordcount];
}
}
}
fatal("pick_nth did not find nth tuple");
return NULL; // this avoids a compiler warning, but
// is never executed
}
/* generate -- generate from trigram model */
/**/
void generate(int howmany, char *word1, char *word2)
{
int count;
char sep; /* word separator */
printf("%s %s teststststststssssssssssssssssssss\n", word1, word2);
for (count = 2; count < howmany; count++) {
/* first see how many pairs there are: */
int n = countpairs(word1, word2);
char *newword;
if (n == 0) fatal("no pairs found");
/* now pick a random number from 1 to n: */
n = (rand() % n) + 1;
/* find nth word: */
newword = pick_nth(word1, word2, n);
/* shift */
word1 = word2;
word2 = newword;
/* break the line every 10 words */
if (count % 10 == 0) sep = '\n';
else sep = ' ';
/* and add periods before capitalized words */
if (isupper(word2[0])) printf(".");
printf("%c%s", sep, word2);
}
printf(".\n");
}
int main()
{
// seed the random number generator:
srand((unsigned) time(NULL));
opentext("life.txt");
readwords();
fclose(inp);
generate(200, words[wordcount/2],
words[wordcount/2 + 1]);
getchar(); // this prevents the output window from
// closing until the user has looked at it
return 0;
}
//优化后的程序
[word3.cpp]
#define streql(s1, s2) (strcmp(s1, s2) == 0)
char *string_to_heap(char *s)
{
for (int i = 0; i < wordcount; i++) {
if (streql(s, words[i])) {
return words[i];
}
}
char *h = (char *) malloc(strlen(s) + 1);
if (!h) fatal("no more memory for words");
strcpy(h, s);
return h;
}
// readwords -- read words to array
//
void readwords()
{
// each word goes here first:
char word[1024];
while (getword(word)) {
// then we figure out how much memory was
// really needed, allocate that on the heap,
// copy word to the heap memory, and save
// a pointer in words array:
words[wordcount] = string_to_heap(word);
wordcount++;
}
}
7.malloc
#include <iostream> using namespace std;
class Myobject { public: static Myobject *alloc(); void free(); protected: Myobject *next; // used for allocation static Myobject *freelist;
long x; // your member variables go here, long y; // etc.
};
// declare and initialize empty free list: Myobject *Myobject::freelist = NULL;
Myobject *Myobject::alloc() { Myobject *p; // fast allocation: pop object from freelist if (freelist) { p = freelist; freelist = freelist->next; } else { p = new Myobject; } return p; }
void Myobject::free() { this->next = freelist; freelist = this; }
// quick test and demonstration: // void main() { Myobject *a[10]; cout << "Allocated: "; for (int i = 0; i < 10; i++) { a[i] = Myobject::alloc(); cout << a[i] << " "; } cout << endl; // free the even-numbered objects: cout << "Freed: "; for (i = 0; i < 5; i++) { cout << a[i * 2] << " "; a[i * 2]->free(); } cout << endl; // reallocate the even-numbered objects: cout << "Allocated: "; for (i = 0; i < 5; i++) { a[i * 2] = Myobject::alloc(); cout << a[i * 2] << " "; } cout << endl; }
|
Example 1 Allocating members with new
|
The output is:
Allocated: 0x00780D90 0x00780D50 0x00780D10 0x00780CD0 0x00780C90
0x00780C50 0x00780420 0x007803E0 0x007803A0 0x00780360
Freed: 0x00780D90 0x00780D10 0x00780C90 0x00780420 0x007803A0
Allocated: 0x007803A0 0x00780420 0x00780C90 0x00780D10 0x00780D90
|
Output 1 Allocating members with new |