What service replaces the older indexing service to provide indexed search results?

Microsoft English Query and Full-Text Search

In Designing SQL Server 2000 Databases, 2001

Full-Text Search Architecture

As shown in Figure 8.14, the Microsoft Search Service provides SQL Server full-text searches through an OLE DB interface. All the data for the full-text indexes is stored outside SQL Server, and all searching is done by the Microsoft Search Service rather than by the SQL Server query engine. When the SQL Server query engine determines that it needs to perform a full-text search, it calls the Microsoft Search Service, which executes the query and passes the results back to the query engine.

Figure 8.14. The full-text search architecture.

In addition to performing the searches, the Microsoft Search Service builds, populates, and maintains the full-text indexes. It maintains the series of files that make up the full-text indexes in full-text catalogs. Many indexes can be in one catalog, but they must all be from the same database, because a catalog cannot span databases. Even so, the limit on catalogs is per server, not per database: you can have a maximum of 256 full-text catalogs per server. Because SQL Server does not maintain full-text indexes itself, they do not behave like SQL Server indexes. A SQL Server index is always and immediately updated to reflect changes to the data it represents, whereas, in most cases, a full-text index must be explicitly rebuilt to reflect any changes to the data.
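
As a point of reference, the following is a minimal T-SQL sketch of how a full-text catalog and index are typically set up and populated in SQL Server 2000; the Products table, its unique index PK_Products, and the ProductID and Description columns are hypothetical placeholders:

-- Run in the database that owns the table; enable full-text indexing for it.
EXEC sp_fulltext_database 'enable'

-- Create a catalog; its files are maintained by the Microsoft Search Service, outside SQL Server.
EXEC sp_fulltext_catalog 'ProductCatalog', 'create'

-- Register the table, its unique index, and the column to be indexed.
EXEC sp_fulltext_table 'Products', 'create', 'ProductCatalog', 'PK_Products'
EXEC sp_fulltext_column 'Products', 'Description', 'add'

-- Activate the table and start a full population (the explicit rebuild mentioned above).
EXEC sp_fulltext_table 'Products', 'activate'
EXEC sp_fulltext_catalog 'ProductCatalog', 'start_full'

-- CONTAINS (and FREETEXT) predicates are handed off to the Microsoft Search Service.
SELECT ProductID, Description
FROM Products
WHERE CONTAINS(Description, '"widget"')

Once the full population completes, the CONTAINS query returns the rows whose Description contains the word "widget," and the query engine merges those results with any other predicates in the statement.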

Microsoft Search Service

The Microsoft Search Service is a service external to SQL Server that must be installed before you can perform full-text searches; it is installed as a component during a typical installation. Although you can have many SQL Server instances running on a single computer, you can have only one instance of the Microsoft Search Service on any computer.

The Microsoft Search Service is external to SQL Server, and it runs in the context of the local system account. However, the account running SQL Server must be an administrator of the Microsoft Search Service. When you install the Microsoft Search Service, this relationship is set up correctly. To ensure that it is maintained correctly, you should change the account running SQL Server only through the SQL Server Properties dialog box in Enterprise Manager. If you change it elsewhere, such as through the Services applet in Control Panel, the change will not propagate through to the Microsoft Search Service.

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781928994190500117

Designing and Creating SQL Server Databases

In Designing SQL Server 2000 Databases, 2001

Microsoft Search Service

The Microsoft Search service is responsible for full-text indexing and executing full-text queries against SQL Server. If you have defined any full-text catalogs in your database, the Microsoft Search service is responsible for creating and updating those indexes. Full-text search requests are received by the Microsoft Search service for processing, and search results are returned.

You can configure the autostart property of the Microsoft Search service using the Service Manager utility. The Microsoft Search service runs the mssearch.exe file located in the Program Files\Common Files\System\MSSearch\Bin directory. It is configured to use the local system account, which offers adequate permissions to complete its tasks. As with the SQL Server service, you can view the memory and processor time being used by the Microsoft Search service from the Task Manager utility.

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781928994190500075

The Store

Tony Redmond, in Microsoft Exchange Server 2007 with SP1, 2008

5.13 Exchange 2007 Content Indexing

Microsoft introduced content indexing (also called Exchange Search) for mailbox and public folder databases in Exchange 2000. This implementation used an early version of the Microsoft Search engine, but it proved to be unsuccessful in production because of the load that indexing generated on the server. Over time, we have seen increases in server performance, better knowledge and implementation of search algorithms, and the appearance of efficient desktop search engines that are capable of indexing mailbox data, such as Microsoft's own Windows Desktop Search and the LookOut add-in for Outlook that Microsoft purchased in 2005, plus competitors such as Google Desktop. Microsoft includes the ability for Outlook 2007 to use Windows Desktop Search, but the big issue with desktop indexers is the performance impact that desktop indexing can have on the server. To the server, a desktop indexer can seem like a voracious user who keeps demanding information in a seemingly endless stream of requests from the desktop to the server. It is clearly more efficient to perform indexing once, on the server, and have clients use that index rather than generating local indexes, but users will not move away from local indexes unless they are sure that the server-based equivalent works. Therefore, to avoid the proliferation of local client-based indexes, we need tremendously efficient and responsive server-based indexing. This never happened in Exchange 2003, where indexing was so slow that even Microsoft realized that it was best to disable the feature by default.

Apart from desktop searches, Microsoft supports Exchange search folders and programming interfaces to allow external programs to search within the Store. Huge differences exist between Exchange 2007 content indexing and in-Store searches; the first three listed here are the most important in practice—content indexing is faster, more accurate, and can look through attachments:

Content indexing is invariably faster because it uses indexes to locate requested information while Store searches have to perform serial scans through everything in the mailbox or folder that you search.

Content indexing allows searches based on words, phrases, and sentences while a Store search uses a stream of bytes. Content indexing therefore ignores punctuation marks and spaces and is case insensitive while Store searches look for exact matches of all supplied characters.

Content indexing supports filters for a number of common attachment types so it can index and search the content of those attachments. Store searches look through MAPI properties, which include message bodies but not attachments. On the other hand, because Store searches can access MAPI properties, you are not limited to content searches and can check date ranges and other properties such as importance, flags, etc.

Content indexing is usually more accurate because it uses word prefix matches as well, so a search for “star” will also discover items that contain “starlet” or “stargazer,” but not “filmstar.” Store searches will find all three instances unless a prefix match is explicitly specified.

Content indexing understands non-English languages, which gives it the ability to support correct searches against content that contains double-byte characters such as Japanese and Chinese. The byte stream searches performed by the Store have great difficulties with double byte languages.

While content indexing offers major advantages in search accuracy, customers will enable it only if it responds quickly to user requests and does not overload the server. These were the major issues with content indexing in previous versions of Exchange, so Microsoft rewrote the entire subsystem for Exchange 2007 based on Version 3.0 of the Microsoft Search engine, which SQL Server 2005 also uses. SharePoint Portal Server 2007 also uses Microsoft Search 3.0 together with some of its own technology.

Microsoft also rewrote the software layer that connects the indexing engine to the Store so that it is much more efficient in how it accesses data. The net effect is that indexing is smooth and consistent and up to 35 times faster (Microsoft's own estimate) in indexing activities. The only real downside is the requirement to allocate between 25% and 30% of the size of the databases for the index. Because indexing now takes only between 3 and 5 percent of server CPU when it is running in a steady state to keep the indexes up to date, Microsoft now enables indexing by default. From a user perspective, searches return within seconds rather than minutes, so there is less reason to use a desktop search, although desktop search engines remain useful when PCs operate disconnected from the network.

Exchange 2007 uses two services to perform indexing.

The Exchange Search Indexer is responsible for creating and maintaining the search index. It monitors the Store for new items and invokes the necessary processing to update the index. For example:

The indexer adds any new items that arrive into mailboxes through email delivery or item creation.

The indexer scans the complete mailbox if a new mailbox is added to a database. This step ensures that any mailbox moved to a server has its contents incorporated into the index as soon as possible.

The indexer scans new databases that are added to the Store.

The Indexer is responsible for locating new items for indexing. When it discovers items that have not been indexed, it sends sets of pointers to these items to the MS-Search engine, which is responsible for indexing the content. Exchange creates a separate catalog folder that contains multiple index files for each mailbox database and the index files occupy between 5% and 10% of the size of the mailbox database.

To retrieve the content, the MS-Search engine uses a special protocol handler to fetch the items from the Store and then pours the content through a set of filters that can understand specific formats such as Word, PDF, HTML, PowerPoint, Excel, and so on (at the time of writing, the new Office 2007 formats are not supported—Microsoft may address this omission in Exchange 2007 SP1). It is possible to customize indexing by adding new filters for new formats—but if you add a new filter (which requires you to buy in the filter code—or install a new service pack or hotfix), Exchange will only index new content in that format unless you recreate the entire index.

The output of the filters is plain text content, which MS-Search then processes using a set of word breakers that identify the individual words in the content. The word breakers are able to handle content in different languages, including Japanese and Chinese. One possible complication for content indexing arises when documents are in one language but marked as being in another. For example, if you create an English language document on a PC that runs the Dutch version of Word, Word marks the document as Dutch. When content indexing applies its Dutch word breakers you cannot guarantee that the words that are discovered will make sense for the purpose of searching.

If you look at processes running on an Exchange server, you will see two processes taking up CPU and memory to perform indexing. MSFTESQL.EXE is the core indexer, while MSFTEFD.EXE is a filter daemon that pours content from the Store through the filters and word breakers. Apart from a radical improvement in performance, the biggest difference in the implementation is the move from a “crawl” model (periodic updating of the index) to an “always up to date” model, which means that the Store indexes new items automatically as users create them in the database. A new message is usually indexed within ten seconds of its arrival into an inbox. Of course, because of the increase in data cache enabled by the 64-bit platform, new items can be kept in the cache long enough for them to be indexed, so indexing generates no additional I/O—and new items appear in indexes very soon after they are created.

An Exchange server can encounter various situations that cause a mailbox database to undergo automated or manual maintenance, and some of these situations can require Exchange to rebuild the content indexes. Table 5-10 summarizes these situations. Note that Exchange does not automatically include the catalog folder in online backups taken using the Exchange backup API, which is why a full rebuild of the index is required if you have to restore a database from backup. If you take a file-level backup with quiescent databases, you have the choice to include other files along with the databases and can include the catalog folder. In this scenario, if you have to restore the databases after a failure, you can restore the catalog folder too and so avoid the need for an index rebuild.

Table 5-10. Effect of database operations on indexing

Scenario | Action | Result
Online backups | Backup taken with VSS or streamed to tape. | Re-index required if a database is restored from a backup.
File-level offline backups | Offline backup to disk or tape. | Index data can be backed up and restored (without requiring an update) along with mailbox databases.
Fast recovery | Dial-tone recovery using a blank mailbox database followed by recovery of the failed database. | Exchange creates a new index for the blank database and re-indexes once the failed database is recovered.
LCR | Log shipping and application of transactions to a local copy of a mailbox database. | Exchange creates a new index if you have to switch over to the passive copy.
CCR | Switch to the copy of the database on a passive node in an MNS cluster. | Search continues to work after the transition. For lossy failovers, the index is OK and will be updated provided that the loss duration is less than seven days.
SCC | Switch to a passive node (shared copy cluster). | Because only the active node is indexing, the transition causes Exchange to rebuild the index.

Performing a full crawl of a server to populate an index from scratch consumes a lot of CPU and memory and can generate a lot of I/O activity. Depending on the other work that the server is doing, crawling has the potential to disrupt email throughput. To avoid this problem, indexing throttles back its processing demands in periods when server load is high. During normal indexing, the indexer regulates the load that it places on the server by controlling the number of items that it attempts to process per unit of time. A monitoring thread keeps track of the load on each mailbox database by measuring the average latency required to retrieve a document from the database. If the latency crosses a specific threshold (20 milliseconds is the default value), then the indexing engine reduces the load on the server by delaying requests to retrieve information from the Store. Before attempting to fetch a document from the Store, the indexer checks the current processing delay and if it is larger than zero, the indexer goes to sleep for a short time before attempting the fetch. This kind of throttling only occurs during full crawls as the immediate updates to maintain the index do not impose a significant load on a server.

Interestingly, content indexing does not index items that Outlook's junk mail filter detects as spam and moves into the Junk E-Mail folder. However, if you find that a message has been wrongly detected as spam and you move it from the Junk E-Mail folder to any other folder, content indexing will detect the move and include the message in its index.

5.13.1 Using content indexing

Content indexing is configured by default for all databases, so no work is required to set it up. You can check on the current indexing status for all mailbox databases on a server with the Get-MailboxDatabase command:
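
For example, a quick check across a server might look like the following sketch, where ExchMbxSvr1 is a placeholder server name:

# List each mailbox database and whether content indexing is enabled for it.
Get-MailboxDatabase -Server ExchMbxSvr1 | Format-Table Name, IndexEnabled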

The databases that are shown with IndexEnabled = True are obviously those that Exchange is currently indexing. To disable content indexing temporarily, you use the Set-MailboxDatabase command:
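
A minimal example, using an illustrative database name, would be:

# Temporarily turn off content indexing for a single mailbox database.
Set-MailboxDatabase 'VIP Mailboxes' -IndexEnabled $False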

Remember to re-enable content indexing later on by setting the IndexEnabled property to “True.” If for some reason you want to disable content indexing for all databases on a server, you can either set the IndexEnabled flag to False with the one-line shell command:
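
A sketch of such a one-liner, again assuming a server named ExchMbxSvr1, is:

# Disable content indexing for every mailbox database on the server in one pass.
Get-MailboxDatabase -Server ExchMbxSvr1 | Set-MailboxDatabase -IndexEnabled $False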

Alternatively, you can stop the “Microsoft Exchange Search Indexer” service. Users will experience a radical reduction in performance if they conduct online searches when the indexing service is unavailable, especially when searching through large folders, and they will only be able to search message subjects rather than content. Outlook Web Access signals the potential performance degradation with a pop-up, as shown in Figure 5-24.

Figure 5-24. Content indexes are not available

Another interesting shell command allows you to test the performance of a search operation for a specific mailbox (and the underlying mailbox database). You need to have the permissions to write into the mailbox that you want to test:
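
The cmdlet being referred to here appears to be Test-ExchangeSearch; a sketch of an invocation against an illustrative mailbox:

# Insert a test message into the mailbox and report whether (and how quickly) it is indexed.
Test-ExchangeSearch -Identity 'JSmith'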

Exchange responds to the test by reporting whether it was able to perform a search and the search time. A result of ten or less for the search time is acceptable. A reported search time of -1 indicates failure!

Now that we understand how to enable and test content indexing, we need to understand when it is used. If the mailbox database that hosts the mailbox is enabled for content indexing, all searches executed by Outlook Web Access clients connected to mailboxes in the database use content indexing. This is easy enough to understand because Outlook Web Access works in online mode. Things get a little more complicated with Outlook because it supports online and cached Exchange modes.

When Outlook clients (any version from Outlook 2000) operate in online mode, the vast majority of searches use content indexing, provided that the mailbox database is indexed and the Exchange Search service is enabled for the database. In addition, the search query must contain fields that are actually indexed. Most fields are indexed, so this is not usually a problem. However, some queries exist that content indexing cannot satisfy, especially those that invoke complex combinations of NOT, AND, and OR checks against fields or any query against a field that is not indexed, such as a date field. In these cases, Outlook reverts to a Store search. Figure 5-25 shows the Outlook search builder in operation—you can build very complex queries to narrow in on the right information, but it is probably just as easy to confuse users.

Figure 5-25. Building a complex search with Outlook 2007

Outlook 2003 and 2007 clients that operate in cached Exchange mode use client-side linear scans similar to those performed by the Store. Performance is often better than Store searches, but these clients never use Exchange content indexing. You can, of course, install an add-in product that indexes Outlook data and use that to achieve better performance.

Windows Desktop Search uses the same basic indexing and search technology as SharePoint and Exchange, but obviously on a less massive scale. Windows Desktop Search is included in Windows Vista and can be downloaded and installed on Windows XP clients and controlled through Group Policy. On a purely unscientific basis, Windows Desktop Search seems to function more smoothly under Vista than Windows XP because it appears to place less strain on the operating system when it indexes items.

Figure 5-26 shows the options that you can configure with Outlook 2007 to control how search operates. Outlook 2007 clients that operate in cached Exchange mode do not use server-based content indexing and use Windows Desktop Search instead. Windows Desktop Search relies on the cached copy of the mailbox plus any PSTs to include messaging data into its index. Note that it is possible for a client to generate searches using Exchange-based indexes and local indexes. For example, if you run Outlook 2007 in online mode and execute a search across your mailbox and a PST, Outlook uses Exchange content indexing to search mailbox content and Windows Desktop Search to search the PST.

Figure 5-26. Outlook 2007 Search Options

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781555583552500088

Social search engines

Scott Brown, in Social Information, 2012

What is it?

The Samepoint site states, “Since 2008, Samepoint has been providing Social Media Search Results on an international level.” Up until 2012, Samepoint branded itself as a “reputation management search engine,” produced by Darren Culbreath (//www.darrenculbreath.com). In the first half of 2012, the Samepoint interface and access changed, and users are now required to sign in, either by registering on the Samepoint site or by signing in via Twitter or another online property.

Samepoint in its current iteration offers three functional pieces: social search, a “real time” dashboard, and a “top topics/brand search.” The social search allows users to search across social properties as well as conduct domain searches. The “real-time” dashboard provides search results as well as some analysis around sentiment and influencers. The “top topics/brand” search has some pre-determined brand searches set up, and offers a search box as well. We’ll take a look at all three of these functions in our example.

Samepoint is also an example of a social search tool that provides “sentiment” tracking – the ability to track the tone of online comments and conversation – hence, the tagline of being a “reputation management” search engine. Theoretically, organizations can track the tone of conversations happening online, so that they can then take action to manage those conversations, or react in an appropriate manner. I’ll discuss sentiment more in the section “Some initial tips for getting the most out of Samepoint.”

Like all social search engines, Samepoint searches across many different social platforms. Samepoint relies on web search results from Bing, the Microsoft search engine. Though the interface will likely continue to change, the current categories of sources for social search include:

Social media sites: Returns results from a variety of social media properties, including blogs like WordPress and Tumblr, as well as Facebook, Yelp, and LinkedIn.

LinkedIn: Returns search results from LinkedIn Groups, Answers, and Companies. An interesting way to zero in on information specifically being shared in LinkedIn.

Government sites: This essentially allows you to perform a “.gov” domain search, similar to what you could do in Google Advanced Search. A “.gov” domain search primarily returns US government site results.

Education sites: Similar to the Government search, this performs a “.edu” domain search, similar to what you could do in Google Advanced Search. This will return results from any “.edu” domain.

Organizations: Similar to the Government and Education searches, this performs a “.org” domain search, again similar to what you could do in Google Advanced Search. This will return results from any “.org” domain, theoretically non-profit organizations (though this is not always the case).

Military: This function used to return results from social tools used by US military agencies, including the Navy, Air Force, Marines, Army, and Coast Guard. However, as of April 2012, a military search seems to perform only a general web search; this feature does not appear to be working as of this writing.

Negative: In theory, this capability allows you to quickly identify negative comments, a key functionality in responding to online conversations. We’ll come back to this feature in our discussion of sentiment.

Reviews: Searches across reviews from sources like Amazon, Citysearch, Tripadvisor, and Yelp. With these sources, Samepoint might be particularly useful in finding reviews across a variety of organizations, products, and services, including restaurants and hotels.

In conducting a search, Samepoint pulls back results across all of these sources. It seems to sort results by relevance, which means that, typically, an organization’s top social properties become readily apparent after conducting an overall search.

In previous iterations of Samepoint, results could be further segmented by individual sources, such as specific social platforms. However, the version of Samepoint available in April 2012 does not support this, an unfortunate example of a social tool losing valuable functionality.

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781843346678500054

Email: The Lifeblood of Modern Communication

Derek L. Hansen, ... Marc A. Smith, in Analyzing Social Media Networks with NodeXL, 2011

8.5.1 Preparing Email

Most email clients do not export data in a format amenable to network analysis. Furthermore, the email you'd like to analyze may be stored in different formats and reside on different computers or web mail servers. As a result, you may need to prepare your email before it is ready to import into network analysis tools such as NodeXL.

The easiest way to transform email messages into network relationships (i.e., an edge list) is to use NodeXL's Import from Email Network feature. This feature relies on the Windows Search utility that comes preloaded on recent versions of Windows and can be downloaded for free for older versions. Windows Search indexes files including email messages that are stored, for example, in Thunderbird, Outlook Express, or Office Outlook 2007. It can also be used to access files in shared directories on other computers.

You may not have the email you want to analyze on a local or shared machine. For example, you may exclusively rely on a web mail service such as Gmail or Hotmail. Nearly all web mail services allow you to download local copies of your messages via POP or IMAP to an email client such as Thunderbird, Outlook Express, or Office Outlook 2007. If you are using IMAP, make sure to download the complete email message files, not just the header information. Otherwise Microsoft Search will not index the content of the messages and allow you to import them using NodeXL (as described later). You can typically choose not to download attachments if there are space limitations. If you use IMAP you can also restrict the download by folder. For example, you may want to only download recent messages (i.e., those sent in 2009) rather than years of data. After downloading messages, it may take Microsoft Search some time to index all of the files. If you have subscribed to an email list and retained all of the messages you want to analyze, you can place them in a folder and use IMAP to download just those messages. Figure 8.2 shows an example of Windows Search results (running on Windows XP) after a Gmail folder called 2009 has been downloaded to Outlook Express via IMAP and indexed.

Figure 8.2. Windows Search results showing all 2174 email messages on the desktop containing the search term "NodeXL," and the folder they are contained in.

Advanced Topic

Working with Large Email Collections

You may want to create networks based on email archives that are not in a format that Windows understands. For example, mbox and maildir are common formats found in Linux and Apple system mail clients. Maildir stores one text file per message in a directory hierarchy that matches the user's mail client, whereas mbox stores all messages in a single file.

One strategy for dealing with this issue is to use specialized programs like Mailbag Assistant and Aid4Mail that can aggregate email stored in multiple devices or formats, perform and store advanced searches, and export emails into a range of formats. For example, you can use these programs to open email list archive files and convert them to a format such as "eml" files that can be indexed by Microsoft Search. One of the authors has successfully used Mailbag Assistant to work with email collections of approximately 100,000 messages with reasonable performance on a standard machine.

Another strategy is to create a database of the email messages that can be queried in multiple ways. This allows you to apply language processing and text mining approaches not available in the NodeXL import wizard. Some email programs like Mailbag Assistant will create a database for you. Alternatively, you can convert emails into XML and then use Excel's built-in XML maps feature to populate the Excel fields based on the XML database content.

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9780123822291000084

Windows Forensic Analysis

Ryan D. Pittman, Dave Shaver, in Handbook of Digital Forensics and Investigation, 2010

Registry Potpourri

It is easy to see that the worth of the Windows registry in computer forensic investigation cannot be overstated. Almost any aspect of a well-planned examination has the potential to uncover pertinent evidence in the registry. This begs the question: what more is there to find that has not been covered already? The answer is plenty!

For example, some of the most often examined areas of the registry are generally called user assist keys or most recently used (MRU) keys; the subkeys and values in these areas hold information that Windows has deemed important to helping the user perform small tasks on the system, such as opening often used files or speeding up access to certain resources. Consider the following subkey in Windows XP:

NTUSER.DAT\Software\Microsoft\Search Assistant\ACMru

Figure 5.55 demonstrates that the values and data under this subkey (and its children) provide information about user searches via the Windows Search Companion (most often accessed by right-clicking a folder and choosing Search… or from the Start button). The values are incremented beginning at zero, with the lowest numbers representing the most recently searched for items.

Figure 5.55. Examination of the currently logged on user's search activity.

Another helpful location is NTUSER.DAT\Software\Microsoft\Internet Explorer\TypedURLs, which contains a list of the last 25 URLs typed into the Internet Explorer address bar (i.e., pages visited by means other than simply clicking on a link). Like with the values under Search Assistant, the lowest numbered values represent the most recent additions to the list, but the values in TypedURLs begin numbering at one instead of zero (URL1). When the limit of 25 is reached, the values are disposed of in a first in/first out (FIFO) operation.

Programs or locations opened by a user from the Start→Run… location can be found in NTUSER.DAT\Software\Microsoft\Windows\CurrentVersion\Explorer\RunMRU. The values in this location are displayed as a series of letters, beginning at “a,” the key to which lies with the MRUList value (Figure 5.56).

Figure 5.56. RunMRU subkey containing last items run from Start→Run….

The data contained in the MRUList value shows an ordering of the other values based roughly on the last time the value was accessed (although, it actually has more to do with the order in which values are displayed in the drop-down menu under Start→Run…). So, in the example seen in Figure 5.56, the data in value h was the most recently accessed (in this case, opening Windows Explorer to view the System32 folder), with the programs listed in values g, c, f, a, e, b, and d having been last accessed via Start→Run… in that order (descending). However, it must be noted that the MRUList value is updated (and the value list reordered) only when a new program or location that is not already on the list is added (meaning, if the user were to rerun cmd via Start→Run…, the data seen in Figure 5.56 would not change).
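
As a quick illustration (a sketch only, reading the live hive of the currently logged-on user rather than an offline NTUSER.DAT), these MRU values can be pulled with a few lines of PowerShell:

# Read the RunMRU values for the current user and list them in MRUList order (most recent first).
$run = Get-ItemProperty 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Explorer\RunMRU'
foreach ($letter in $run.MRUList.ToCharArray()) {
    $name = [string]$letter
    '{0}: {1}' -f $name, $run.$name
}

# The TypedURLs values are numbered instead, with URL1 holding the most recent entry.
Get-ItemProperty 'HKCU:\Software\Microsoft\Internet Explorer\TypedURLs'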

These examples are just scratching the surface of the wealth of data contained in the Windows registry that could help examiners. Everything from autostart locations (that are of prime importance in intrusion/malware investigations), to recently opened Office documents, to instant messaging (IM) and peer-to-peer (P2P) data, to shared folders and mapped drives, to default printer information and much more can all be found in the wizard's bag that is the Windows registry; the trick is knowing where to look and how to translate or interpret what resides there.

Tool Feature: Registry Analysis

Many forensic suites, such as EnCase, FTK, and ProDiscover, have specialized functionality or scripts designed to access these (and many more) useful locations in the registry. Other third-party tools like RegRipper, Windows Registry Recovery, and Registry File Viewer (www.snapfiles.com/get/rfv.html) can aid the examiner in a quick rip-and-strip of subkeys of interest in specific registry hives. For a more in-depth look at registry data from a forensic perspective, including a spreadsheet listing useful locations and registry analysis tools on the DVD that accompanies the book, see Carvey (2009).

It should also be noted that, as registry hives have their own internal structures, it is possible to identify and recover deleted registry data using forensic tools like EnCase. The RegLookup tool also includes a recovery algorithm for deleted Registry keys (Morgan, 2008). Another option for recovering deleted registry data is regslack.pl, written by Jolanta Thomassen (www.regripper.net/RegRipper/RegRipper/regslack.zip).

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9780123742674000057

MCSA/MCSE 70–294: Working with Forests and Domains

Michael Cross, ... Thomas W. Shinder Dr.Technical Editor, in MCSE (Exam 70-294) Study Guide, 2004

Domain Rename Conditions and Effects

The domain rename procedure is complex, requires a great deal of care in planning and execution, and should always be tested in a lab environment before performing it on an operational forest. The time required to go through a complete domain rename operation varies; the number of domains, DCs, and member computers is directly proportional to the level of effort required.

NOTE

There is a good reason for caution. Read this entire procedure before attempting any part of it, including the pre- and post-procedure steps. You might find limitations that preclude the procedure altogether on your network. Consult Microsoft documentation, read Technet articles, and search for patches, hotfixes, and service packs that can affect domain renaming and forest restructuring. Every attempt is made in this chapter to address all pertinent topics and concerns, but issues and conflicts continue to be exposed over time. Search Microsoft.com for new "Q" articles detailing conditions that might have an effect on this procedure. Most importantly, consider hiring a consultant who has recently and successfully performed a domain renaming operation.

Before undertaking a domain rename operation, you must fully understand the following conditions and effects. They are inherent in the process and must be dealt with or accommodated.

Each DC requires individual attention. Some changes are not replicated throughout the Active Directory. This does not mean that every DC requires a physical visit. Headless management can greatly reduce the level of effort required, depending on the size and structure of the domain and the number of sites it contains.

The entire forest will be out of service for a short period. Close coordination is required with remote sites, especially those in other time zones. During this time, DCs will perform directory database updates and reboot. As with other portions of the procedure, the time involved is proportional to the number of DCs affected.

Any DC that is unreachable or fails to complete the rename process must be eliminated from the forest for you to declare the procedure complete.

Each client workstation requires individual attention. After all DCs have updated and rebooted, each client running Windows 2000 or Windows XP must be rebooted two times to fully adapt to the renamed domain. Windows NT workstations must disjoin from the old domain name and rejoin the new domain name, a manual process that requires a reboot of its own.

The DNS host names of your DCs are not changed automatically by the domain rename process. To make them reflect the new domain name, you must perform the domain controller rename procedure on each DC. Having the host name of a DC decoupled from its domain name does not affect forest service, but the discrepancy will be confusing until you change the names.

The DNS suffix of client workstations and member servers will automatically update through the domain renaming process, but not all computers will match the DNS name of the domain immediately. As with most portions of this process, the period of time required is proportional to the number of hosts in the domain.

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781931836944500106

Administration and Active Directory Integration

In Designing SQL Server 2000 Databases, 2001

Moving and Copying SQL Server Databases

On occasion, administrators find that they must move or copy a database from one server to another. The processes for both are identical. The only difference between a move and a copy is that the original database is deleted in a move. The following method can be used for either copy or move operations:

1.

Back up the database on the source computer.

2.

Back up the database a second time, and verify the backup.

3.

Install an instance of SQL Server on a destination computer.

4.

Configure backup devices on the destination computer.

5.

Review the files on the destination computer to ensure that no duplicate filenames exist. If files with the same names already exist on the destination computer and you do not allow them to be overwritten, the restore will fail with an error; if you do allow them to be overwritten, the existing files will be overwritten, which might not be desirable.

6.

Review the file and directory structure to ensure that drive letters and folders exist where you will be restoring files. If you restore a file that is located on a nonexistent drive location, there will be an error.

7.

Ensure that the destination computer has Full-Text Search installed and the Microsoft search service started, if the database being copied uses full-text indexing.

8.

Restore the backup to the destination computer. The restore process will automatically create the appropriate database files.

9.

Validate that the database is accessible by clients.

10.

If you are moving the database, locate the database in Enterprise Manager on the source computer, right-click it, then select Delete from the pop-up menu.

You can use Transact-SQL statements to restore database files. The only person logged in to the database should be the login account executing the restore process—typically the SA account. The following shows the T-SQL statements for restoring database files:

USE master

GO

RESTORE DATABASE databasename

 FILE='databasename_data_1',

 FILEGROUP='filegroupname'

 FROM backupdevice

 WITH NORECOVERY,

 REPLACE

GO

RESTORE LOG databasename

 FROM backupdevice

 WITH NORECOVERY

GO

RESTORE LOG databasename

 FROM backupdevice

 WITH RECOVERY

GO

Note

When you restore a database onto a different computer, the owner of the database is not the owner of the original database on the source computer. Instead, ownership is assigned to the SQL Server login account that initiated the restore operation. The system administrator (SA), as well as the new owner, can change the ownership if another owner is desired.

The REPLACE option specifies that the restored files can overwrite existing files in the same location with the same name. The NORECOVERY option leaves the database in a restoring state so that further backups can still be applied; RECOVERY is used when there is nothing further to apply and the database should be brought online. To apply transaction log backups, use the RESTORE LOG statement: use NORECOVERY if there are further logs to apply, but use RECOVERY if there are no further logs to apply.

If you are moving the files to new locations, the following example restores a specific set of files and uses the MOVE option to place them in their new locations:

USE master

GO

RESTORE FILELISTONLY

 FROM backupdevice

RESTORE DATABASE databasename

 FROM backupdevice

 WITH NORECOVERY,

 MOVE 'database_datafile_1' TO 'C:\Folder\database_datafile_1.mdf',

 MOVE 'database_datafile_2' TO 'C:\Folder\database_datafile_2.mdf'

GO

RESTORE LOG databasename

 FROM backupdevice

 WITH NORECOVERY

GO

RESTORE LOG databasename

 FROM backupdevice

 WITH RECOVERY

GO

Enterprise Manager

You can restore files with Enterprise Manager. To do so, follow this procedure:

1.

Log on as a user with administrative privileges.

2.

Click Start | Programs | Microsoft SQL Server | Enterprise Manager.

3.

Navigate to the server, and expand it.

4.

Expand the Databases container below the server.

5.

Right-click the database.

6.

Select All Tasks from the pop-up menu.

7.

Select Restore database.

8.

Type in the name of the database that you will restore.

9.

Select the option “From device,” as shown in Figure 6.26.

Figure 6.26. The Restore database dialog box.

10.

Click Select Devices.

11.

Select Tape or Disk and then the device where the backup exists.

12.

In the Restore Database dialog box, click View contents, and select the backup set that you will restore—or to save time, simply type in the name of the backup set.

13.

On the Restore backup set dialog box, click File or Filegroup, and type the names of the files that will be restored.

Copy Database Wizard

The most convenient way to copy or move a database is to use the Copy Database Wizard. Permissions are important in this operation; you must be an SA on both the source and destination SQL Servers. This method does not require server downtime, and to avoid performance issues, the operation can be scheduled to run at times when the database is not under a heavy workload. Another advantage is the easy-to-use GUI wizard.

The Copy Database Wizard is easy to locate. In Enterprise Manager, click the Tools menu, and select Wizards. (Alternatively, you can right-click a server, select All Tasks, and then select Copy Database Wizard from the pop-up menu.) The screen that you will be presented with will have four expandable nodes. Expand the Management node, and select the Copy Database Wizard, as shown in Figure 6.27. Then click OK.

Figure 6.27. The Management Wizards list.

You should have administrative privileges to run the Copy Database Wizard. When you first start the wizard, you will be presented with a Welcome screen. Click Next to continue. The next dialog box with which you are presented allows you to select the SQL Server that contains your source database. You can accept the default, type in a name, or click the ellipsis (…) button to search for another source server. Then select the authentication method that you will use (either Windows NT authentication or SQL Server authentication), and click Next to continue.

The next screen prompts you to select a destination server. Notice that the default selection is < local >, illustrated in Figure 6.28. This is the reason for executing the Copy Database Wizard on the destination computer. Again, select whether to use Windows authentication or SQL Server Authentication, and click Next to continue.

Figure 6.28. The default destination server is <local>.

Warning

Source and destination servers cannot be the same when you use the Copy Database Wizard. You will not be able to copy a system database, such as msdb or master, with the Copy Database Wizard, either. If you want to execute a special copy or move operation, you can use Data Transformation Services instead.

Your next task is to select the databases that you want to move or copy. You will not be able to move or copy databases that have the same name, nor will you be able to rename the database during the move or copy operation. For each operation, you can mark either the move or copy column to the left of the database in the dialog window, as shown in Figure 6.29, and then click Next to continue.

Figure 6.29. Select the Databases to Move or Copy.

Select any other objects to move or copy on the next screen. Then determine the schedule for the operation.

Once you have completed the Copy Database Wizard, the information that you selected is named and saved as a package on the destination server. This package is saved regardless of whether it is executed immediately or awaiting a scheduled time and date for execution.

Tip

To ensure consistency, make certain that there are no active sessions on the server before you begin the move or copy operation. An active session will prevent the Copy Database Wizard from running.

Detaching and Attaching Databases

If you have multiple instances of SQL Server, you can detach the data and transaction log files of a particular database from one instance and then attach them to another server or even another instance of SQL Server on the same computer. The result of this operation is that the database is attached to the new server, completely intact and in the same condition as when it was detached from its former server. This is an excellent method for moving a database without having to restore a database backup, or for moving a database to a newer, larger storage system.

If you move a database from one physical SQL Server system to another, you first detach the database from the source server. Then you move the database files to the new server. Finally, you attach the database to the new server, indicating the new location of the database files. Additionally, you should remove replication for a detached database; this is done with the sp_removedbreplication stored procedure. You can attach and detach databases using Enterprise Manager as follows:

1.

Log on as an account with administrative privileges.

2.

Click Start | Programs | Microsoft SQL Server | Enterprise Manager.

3.

Navigate the server group to the server that holds the database in question.

4.

Expand the server.

5.

Expand the Databases container.

6.

Right-click the selected database.

7.

Select All Tasks from the pop-up menu.

8.

Select Detach Database. Click OK in the confirmation dialog box, shown in Figure 6.30.

Figure 6.30. Detaching a database.

9.

Click OK to the message that the detaching was completed successfully. When you return to Enterprise Manager, you will see that the database no longer shows up under the Databases container.

10.

To attach a database, right-click the Databases container in Enterprise Manager where the new database will be attached.

11.

Select All Tasks.

12.

Click Attach Database. You will see the dialog box shown in Figure 6.31.

Figure 6.31. Attaching a database.

13.

Click the ellipsis (…) button to search for the database file, and when it is found, click OK. Or if you know the name of the file, type it into the space provided.

14.

Click OK.

15.

Click OK to acknowledge the message that the attachment was successful. The database will now appear within the Databases container.

When a database is detached and later attached to another SQL Server in the same Active Directory domain or another Active Directory domain within the same forest, the permissions within the database do not change. Permissions outside the database, however, such as the right to log on to the server itself, need to be reapplied. Therefore, you need to grant the right to log on to the server to all database users, or they will be denied access to the server, even though they have the appropriate permissions within the database itself.
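
For reference, the detach-and-attach sequence can also be scripted in T-SQL; the following is a minimal sketch that assumes a hypothetical Sales database and destination file paths:

USE master
GO
-- Remove replication and detach the database on the source server.
EXEC sp_removedbreplication 'Sales'
EXEC sp_detach_db 'Sales'
GO
-- After copying the .mdf and .ldf files to the destination server, attach them there,
-- pointing at their new locations.
EXEC sp_attach_db @dbname = 'Sales',
 @filename1 = 'D:\SQLData\Sales_Data.mdf',
 @filename2 = 'D:\SQLData\Sales_Log.ldf'
GO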

Read full chapter

URL: //www.sciencedirect.com/science/article/pii/B9781928994190500099

Reliability and energy efficiency in cloud computing systems: Survey and taxonomy

Yogesh Sharma, ... Daniel Sun, in Journal of Network and Computer Applications, 2016

3.2 Causes of failures

To make CCS more reliable and available all the time, it is very important to understand the causes of failures. The various causes of failures in cloud computing are shown below in Fig. 3.

Fig. 3. Causes of failure in cloud computing.

3.2.1 Software failure

As software systems and applications become more complex day by day, they have become a significant cause of system breakdowns, which lead to loss of business and revenue. In October 2013, Knight Capital's cloud-based automatic stock trading software went down for 45 min because of an error in a trading algorithm, which cost the company $440 million. Sometimes an unexpected error can occur while software is being updated, causing the whole system to crash. In 2013, Microsoft's cloud services were interrupted for 16 h. It was revealed that Microsoft had been performing a regular firmware update in a physical region of the data centers; something went wrong, which brought down the whole system. Another major service outage, of 20 min, was seen in January 2015, in which Bing, the Microsoft search engine that also serves Yahoo Inc., went down during a code update. After the crash, Microsoft's rollback mechanism did not work, which forced the service to be shut down on the linked servers to get back to the point where the system was operating correctly. After a successful update, or for system maintenance, service providers sometimes schedule reboots, about which service users are informed in advance. During planned reboots, service providers usually take backup measures to provide uninterrupted service to users. Unplanned reboots, on the other hand, happen after inconsistencies in data integration following a software or hardware update, and the average cost of an unplanned reboot is $9000 per minute. According to Brian Proffitt, up to 20% of software-as-a-service deployment attempts fail because of data integration problems. It is therefore important to shift application design paradigms from machine-based to cloud-based architectures. Other software-related causes of system failure or performance degradation include memory leakage, unterminated threads, data corruption, and storage space fragmentation and defragmentation (Vaidyanathan et al., 2001).

3.2.2 Hardware failure

Hardware failure represents around 4% of all the failures that occur in cloud-based data centers. Among all hardware failures/replacements, 78% are hard disk drives (Fig. 4) (Vishwanath and Nagappan, 2010). In 2007, hard disk drives and memory modules were the two most common hardware components sent by Google for repair (Barroso et al., 2013). Hard disk failures increase as the size and age of the clusters increase. Vishwanath and Nagappan (2010) have shown that failures in hard disk drives (HDD) grow exponentially with age, but become stable after a saturation point. HDD failures can be reduced by timely replacement, which results in an increase in system reliability.

Fig. 4. Percentages of hardware component failures.

3.2.3 Scheduling

In the cloud computing architecture, schedulers are responsible for scheduling requests on the provisioned resources while meeting the user requirements. Requests waiting to be scheduled are initially placed in an input queue. On the basis of the current computing and data resource availability, the scheduler dispatches the requests, in the form of tasks or subtasks, to the resources. Being a bounded data structure, a queue can store only a specific number of requests. If the number of requests exceeds the length of the queue, new requests are dropped and the service becomes unavailable to users. This is called an overflow failure. To avoid queue overflow, a timeout value is assigned to each request. If a request's waiting time in the queue exceeds the specified timeout value, the request is dropped from the queue to make way for fresh requests. This is called a timeout failure. This leads to service outages in the form of SLA violations due to delays in cloud computing services. Failure prediction (Salfner et al., 2010) plays a vital role in identifying system resources that are prone to failure. The scheduler can then avoid placing tasks on resources that are less reliable. The more accurate the prediction, the fewer the failures in the services.

3.2.4 Service failure

In CCS, a service failure can happen with or without a resource failure. As stated by Dai et al. (2010), the cause of a cloud service failure depends upon the stage of the submitted job: the request stage or the execution stage. During the request stage, all the requests with service requirements submitted by users are kept in the ready queue. During this stage, users may not be able to access the services because of overflow or timeout failures that happen due to overloading of resources, such as during peak hours. In such cases, the underlying resources are working fine but they are unable to accommodate more requests, and service failure happens. At the execution stage, on the other hand, requests have been submitted to the underlying physical resources. If services get interrupted at this stage, the cause of the service failure is an outage of the resources.

3.2.5 Power outage

In cloud-based data centers, about 33% of service degradation happens due to power outages, which result from causes such as natural disasters or war zones. In 2012, out of 27 major outages of cloud computing services, 6 were caused by Hurricane Sandy alone. In 2011, the massive tsunami in Japan put the whole country in a power crisis for a long time, and all consumer services were affected. It is estimated that natural disasters contribute around 22% of cloud computing service outages. Another major cause of power outages is UPS system failure, which contributes 25% of total power outage failures and costs around $1000 per incident.

3.2.6 Denser system packaging

Infrastructure that was built ten years ago is now outdated because data storage has increased exponentially. Designers have begun to design very dense servers, such as blade servers, to keep the required floor space low. The total floor space required to set up an IT infrastructure has been reduced by 65%, which has increased device density per square foot, and the outage cost has risen to $99 per square foot. As a result of the high device density, more heat is released, which causes a rise in temperature and affects the working of the devices. Facebook has revealed that by packing machines densely, electrical current began to overheat and melt Ethernet sockets and other crucial components. In 2013, Microsoft's data centers faced a severe 16 h outage due to overheating issues that affected its cloud services, including Outlook, Hotmail, SkyDrive, and Microsoft's image sharing service.

3.2.7 Network infrastructure

In distributed computing architectures, and specifically in the case of cloud computing, all the services are provided over communication networks. All the information is stored on, and exchanged between, servers by using the networks. The outage of the underlying network therefore results in the outage of the services of a CCS. For some cloud-based applications, such as real-time applications, network performance plays a key role: a small increase in network delay can constitute an SLA violation, which is considered a service failure. The network services can be broken physically or logically. Around 3% of service failures happen due to loss of network connectivity. There are various network-related challenges, such as hop count, bandwidth, and encryption, that need to be taken care of to make cloud computing services reliable.

3.2.8 Cyber attacks

Cyber attacks are the fastest-growing cause of data center outages. According to a Ponemon Institute report (P. Institute, 2016), the percentage of data center outages due to cyber attacks was 2% in 2010, had risen to 18% by 2013, and currently stands at 22%. The average downtime cost of an outage caused by cyber attacks is $822,000. IBM's report on cyber security intelligence has argued that 55% of cyber crimes or threats come from people who have access to an organization's systems, such as employees. Among other technical issues, such as trojan attacks and software loopholes, social engineering (Abraham and Chengalur-Smith, 2010) is a major cause of cyber attacks. In social engineering, attackers play on the human psyche, exploiting emotions such as fear and greed, and manipulate people into leaking confidential information.

3.2.9 Human errors

Along with cyber attacks, human error also carries a big weight (22%) among the causes of failures in CCS, with an average cost of $489 per incident. It has been argued by Schroeder and Gibson (2010) that lack of experience is a main reason for the occurrence of human errors. In their survey, it was seen that the proportion of human errors is higher during the initial days of the deployment of infrastructure. This clearly shows that administrators gain more experience with time, which reduces the occurrence of human errors. Similar to cyber attacks, social engineering is also a reason for human errors.

Read full article

URL: //www.sciencedirect.com/science/article/pii/S1084804516301746

