Let's take them one step at a time.
First of all, a single sitemap can't contain more than 50,000 URLs. You can, however, create a sitemap index file that links to other sitemaps. See sitemaps.org for the details.
Regarding the time it takes to create the sitemaps and data changing:
- A sitemap just contains the URLs. If you're aiming that high (1 million pages), you should already have a system in place that generates the sitemaps dynamically. The sitemap data should rarely need to change, because URLs shouldn't change. It doesn't matter what content the pages have, as a sitemap only contains the URL of each page, not its content (please submit sitemaps to Google, not feeds). Also, in a sitemap you can specify how often the content of a page changes, so Google has a hint of how often it should crawl it.
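To make the two points above concrete, here is a minimal sketch of how sitemaps could be generated dynamically in chunks of 50,000 URLs, together with an index file that links to them. The function name build_sitemaps, the sitemap-N.xml file naming and the "weekly" default are my own assumptions for illustration; adapt them to however your system stores its URLs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit mentioned above


def build_sitemaps(urls, base_url, changefreq="weekly"):
    """Return (index_xml, [sitemap_xml, ...]) as strings.

    Hypothetical helper: base_url and the sitemap-N.xml naming
    are assumptions, not part of any standard.
    """
    files = []
    # Split the full URL list into chunks of at most MAX_URLS.
    for start in range(0, len(urls), MAX_URLS):
        urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
        for u in urls[start:start + MAX_URLS]:
            entry = ET.SubElement(urlset, "url")
            ET.SubElement(entry, "loc").text = u
            # Hint for how often the page content changes.
            ET.SubElement(entry, "changefreq").text = changefreq
        files.append(ET.tostring(urlset, encoding="unicode"))

    # The index file lists the location of every sitemap file.
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for i in range(len(files)):
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = f"{base_url}/sitemap-{i + 1}.xml"
    return ET.tostring(index, encoding="unicode"), files
```

You would then write each string to its own file and submit only the index to Google. For 1 million pages this gives 20 sitemap files plus one index.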
Regarding the different number of indexed pages:
- Google has several domains (google.ie, google.com and so on) so that it can target results by location. The results differ because the content of an Irish-only tourism site, for example, is more relevant on google.ie than on google.com if you're searching for hotels.
- You should never have both www.example.com and example.com serving the site. www is actually a subdomain that has traditionally been used as the website's address. It should redirect to the domain name (example.com), or the domain should redirect to the subdomain (www.example.com). Having both up results in duplicate content, so most likely Google will penalize both websites (yes, to us they're the same, but to Google those are two separate websites). So put the redirection in place and you'll get rid of a lot more problems.
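As a rough sketch of the redirect logic, assuming you pick the bare domain as the canonical one: the function below returns the address a request should be permanently (301) redirected to, or None if it is already on the canonical host. CANONICAL_HOST and canonical_redirect are names I made up for illustration, and ports are ignored for simplicity. In practice you would usually do this in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

CANONICAL_HOST = "example.com"  # assumption: the bare domain is canonical


def canonical_redirect(url):
    """Return the URL to 301-redirect to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if host == "www." + CANONICAL_HOST:
        # Keep scheme, path, query and fragment; swap only the host.
        return urlunsplit(
            (parts.scheme, CANONICAL_HOST, parts.path, parts.query, parts.fragment)
        )
    # Already canonical, or a host this sketch does not handle.
    return None
```

If you prefer the www form instead, swap the two hosts; what matters is that only one of them answers with content.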
Also, keep in mind that Google will not index all of your pages even if you do submit a sitemap for all of them. The pages have to be relevant and have backlinks.
Hope this helps!
Regards,
Claudiu