Everything posted by dserban
-
This posting is best viewed in Firefox.

*** IE ***

First, let me begin by saying that I'm not scared of using Internet Explorer. I have done a couple of things to it to turn it into a well-performing and relatively secure application.

1. I downloaded ToolbarCop at: http://forums.xisto.com/no_longer_exists/ Then I went on a cleaning spree, removing all suspicious-looking browser helper objects, including plug-ins for various things such as Acrobat Reader. None of these toolbars and plug-ins belong in my IE, with maybe one notable exception - Java from Sun Microsystems.

2. I went into all shortcuts to Internet Explorer (desktop, Quick Launch, start menu etc.) and appended the string " -nohome" to the "Target" text field, like this: "C:\Program Files\Internet Explorer\iexplore.exe" -nohome. I'm amazed at how much faster IE starts now.

3. I downloaded and installed Privoxy at: https://sourceforge.net/projects/ijbswa/ It's an ad blocker (among many, many other things) in the form of a web proxy. And I discovered a super-cool feature it has: you can use its activity log as a URL snooper. More on that in a moment.

That would be it for Internet Explorer. If you are doing additional things to IE to make it a well-behaved application, please let me know.

*** Firefox ***

Love it :heart: :heart: :heart: The sheer number of generously contributed extensions / add-ons makes it a very attractive proposition if you want more from a browser than just occasional surfing and checking your web-based e-mail. These add-ons are a very powerful differentiator from other would-be browsers. However, despite best efforts, Firefox is a memory hog, at least on my PC. I have 4 add-ons installed in Firefox (CacheViewer, DownThemAll, Greasemonkey and McAfee SiteAdvisor), of which only one - Greasemonkey - stays enabled all the time; I enable the rest on an as-needed basis.

It is worth explaining at this point a usage pattern I developed over time, involving the combination of IE+Privoxy and Firefox+CacheViewer. I sometimes find a very funny home-made movie that someone posted on some obscure video sharing site, and I would like to have a copy of the FLV file on my PC. The sequence of steps to do that is:

1. Right-click on the Privoxy icon in the system tray (the blue circle with a capital P inside it), select "Show Privoxy Window", and from the Privoxy menu select "View" - "Clear log" to clear out all the clutter that has accumulated in the activity log.

2. Open the video link in Internet Explorer and watch the first few seconds. You don't need to watch the whole movie; you just need to make sure that IE got to the point where it requested / touched the FLV file at the remote location.

3. At this point there is a trace of the full URL to the FLV file in the Privoxy activity log (minus the "http://" at the beginning, but no big deal).

4. Now open the same video link in Firefox, assuming you have installed the CacheViewer extension. This time, wait until the movie is completely loaded (let the progress bar of the embedded FLV player reach the 100% mark).

5. Select "Tools" - "CacheViewer" in the Firefox menu and sort the list of cached files by size in descending order. Locate the entry that corresponds to the FLV URL you spotted a few moments ago in the Privoxy trace, right-click on it and select "Save As ...".

*** K-Meleon ***

K-Meleon is basically Firefox without the sophisticated features and, importantly, without the bloat.
The code base behind K-Meleon and Firefox is the same. I use it as a no-frills replacement for Firefox when my PC grinds to a halt from too many memory-hungry applications.

*** OffByOne ***

A portable, extremely low-memory-footprint, but still usable browser to send to your friends via the Yahoo Messenger file-transfer facility, for those rare occasions when their installation of IE becomes corrupt or unusable and they have no way to go out and download Firefox.

*** lynx / links ***

For showing your friends what web browsing used to feel like in the stone age of the Internet.

*** Konqueror ***

Love it, love it, love it :heart: :heart: :heart: :heart: :heart: :heart: My browser of choice in a Linux KDE environment. Beats the living browsing out of Firefox by a factor of 10-20. I wish there were a Windows version of it. As it is, you need a Windows-based X server such as Cygwin/X to be able to run it. I hate to imagine how popular Konqueror would be if it were available for Windows and had the extensibility via add-ons that Firefox has.

*** Opera / Safari ***

Used them occasionally, wasn't terribly impressed, don't know enough about them to make an intelligent comment.

w3schools statistics reveal that Internet Explorer is the most common browser, but Firefox has become quite popular as well: http://www.w3schools.com/browsers/default.asp See also: http://gorelki.com/

Firefox has achieved what many thought impossible and overturned Microsoft's browser monopoly. Mozilla recently announced that the browser had reached 400 million downloads. See the article "Firefox: We caught Microsoft asleep at the wheel": http://www.alphr.com/news/internet/124630/firefox-we-caught-microsoft-asleep-at-the-wheel

Who ISN'T catching Microsoft asleep at the wheel these days? They're becoming the classic example of a company losing direction and resting on its past successes. Nobody with real power in Microsoft's upper tier of management cares much anymore. They're just waiting to retire and cash in their stock options. If they can limp through the next decade, some of the young guys (who are probably engineers today) will revitalize the company once the dead weight leaves.

You can thank the Mozilla team for all the improvements in IE 7. Basically, Microsoft sat on their tails for years until there was some real competition. Their refusal to support PNG transparency was proof enough of their being asleep at the wheel. Their current refusal to support SVG images is proof that they still are.

Numbers are going to be different depending on who is doing the data gathering. One website that averages about 1000 unique visits a day might see about 75% of its traffic from Firefox users. Another one that averages about 400 unique visitors a day might have only 30% of its visitors using Firefox. Coding still needs to be done for more than one browser. Most designers try to code for IE 6 and 7, Firefox 2 and Safari (even though Safari is usually less than 1%).

Most designers can't imagine doing any sort of web application development without Firefox. DOM Inspector, Web Developer and Firebug have become essential add-ons that a serious designer couldn't live without. Even if you always test against IE, Opera and Safari, when it comes to debugging code or layout issues, Firefox has the best integrated tools. The biggest thing that motivates people to use Firefox is the add-ons. Adblock alone saves many people's sanity daily.
And no, it doesn't kill advertising revenue - people still click on the interesting Google text ads, just without the flashy in-your-face graphics, pop-ups, pop-unders, overs or outs.

Firefox got its mojo on thanks in part to the absolute fiasco IE 6 was. IE 6 introduced the world to hijacks, spyware, adware and the like, some of which would be so difficult to remove that it was less time-consuming to crush and rebuild the machine. The AV companies did not offer a solution early on; everyone sat on their tails pointing fingers. It was a mess. IE 6 was the single most destructive thing to ever happen to the Internet.

Mozilla exists to make better software. Microsoft exists to extract money out of their monopoly. Microsoft hasn't cared about improving their products for years. The best you could say is that they are trying to band-aid all the ridiculous deficiencies of their prior versions. IE 7 would never have happened but for Firefox, and as it is, it is a pretty pathetic attempt that doesn't even catch up to Firefox. Microsoft is years behind the curve and will never catch up. Their management is criminally incompetent, and all the tremendous talent and resources they have at the technical engineering level are completely nullified by the evil, bloated, hateful bastards at the top, who have nothing but contempt for their users, the industry and computers as a whole. Microsoft's problem is that it hates everything and everyone except money.

Another issue concerns ads. In the world of advertising, nothing is really free. When you listen to the radio or watch a television show, advertisers pay the costs and earn the right to broadcast their messages any time they want. Most people tolerate radio and television advertising, since they've grown accustomed to its constant interruptions. However, in the world of the Internet, people have a much lower tolerance for advertisements. While advertisements pay for many free web hosting services and free or low-cost Internet services, there's a fine line between product promotion and invasion of privacy.

When you hear or see a commercial on radio or television, you can freely ignore it. Unfortunately, advertisements on the Internet aren't always like that. Ideally, an Internet advertisement would pop up once and give you the option of making it go away. Instead, Internet advertisements not only pop up (and keep popping up over and over again), but they may also track which web pages you visit to determine your preferences - which would be like having a radio or TV that could peek into your living room to see which brand of potato chips you might be eating at the moment. To intrude upon your privacy, Internet advertisers use a variety of tools, including web bugs, adware, and a never-ending cascade of pop-up windows.

Advertisers always need to know how effective their current marketing campaign is. Since the Internet spans the world, it's nearly impossible to tell how many people looked at a particular ad and who they might be. To solve these two problems, advertisers created web bugs. When you visit a website, your browser asks the web server to send your computer all the text and graphic images that make up the web page. Thus, every web server needs to know the IP address of your computer, so it knows where to send the text and graphics. When your browser receives information about a web page, that information appears in the form of HTML (Hypertext Markup Language) code, which tells your browser exactly how to display and position text and graphics.
The specific HTML code that your browser receives from a web page defines the name of the graphic file, its size, and the name of the server it came from. In the following HTML example, the graphic file is called dotclear.gif, its width and height are both one pixel, and the server it came from is ad.doubleclick.net:

<img src="http://ad.doubleclick.net/dotclear.gif" width="1" height="1">

Web bugs hide on ordinary web pages as invisible, one-pixel-by-one-pixel images, so you won't notice when you're being tracked. When the server sends the web bug to your browser, the server can immediately identify the following:

- The IP address of the computer that fetched the web bug
- The specific web page that contains the web bug (useful for seeing which web pages someone might have visited)
- The time and date the web bug was retrieved
- The type of browser that fetched the web bug

At the simplest level, web bugs help advertisers determine how many people have visited a particular website and viewed a particular web page. On a more insidious level, web bugs can work with cookies to track which websites each person visits, so advertisers can display advertisements specific to that individual.

Web bugs can sometimes appear in spam too, buried inside email so an advertiser can see how many times people read (or at least open) a particular message. If someone doesn't bother to view a web bug in an email, this tells the advertiser that the email address may not be valid, or that this particular person didn't bother to read the message. In either case, the advertiser will likely remove that person's email address, to avoid wasting time sending advertisements that no one will read.
-
There are many downloaders out there, some of them free, some of them payware. My personal favorite is CMS Grabber, which runs as a standalone program, so you don't even need to install it. It removes all the inconveniences related to the waiting time, and it shows the captcha image right there in a section of the application screen itself. It doesn't, however, remove the problem related to the IP address. Flushing the DNS cache and releasing / renewing the IP address works when your PC is directly connected to the Internet, but I'm behind a Netgear router, so what I do is log on to the HTTP interface of my router, and on the admin page, under "Router status" - "Connection status", I click "Disconnect", then I go to my browser and attempt to access something on the Internet. My router's built-in DHCP client then requests a new IP address from the ISP - voila. Not all routers are the same in this respect, though. I know from personal experience that Belkin routers give you a hard time when you try to do that.
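For the directly-connected case I mentioned, the flush / release / renew sequence is done from a Windows command prompt like this:

ipconfig /flushdns    (clears the local DNS resolver cache)
ipconfig /release     (gives up the current DHCP lease)
ipconfig /renew       (requests a fresh lease - and, with luck, a fresh IP - from the ISP)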
-
After spending a couple of days trying to optimize various aspects of my WinXP installation, I managed to reduce the list of services running on my PC to the following:

- Application Layer Gateway Service
- Atheros Configuration Service
- COM+ Event System
- DHCP Client
- Event Log
- Network Connections
- Network Location Awareness (NLA)
- Plug and Play
- Print Spooler
- Remote Procedure Call (RPC)
- Security Center
- Task Scheduler
- Windows Audio
- Windows Firewall/Internet Connection Sharing (ICS)
- Windows Management Instrumentation
- Wireless Zero Configuration

My PC connects to the home router via a TrendNet wireless USB key, which is the reason I have those two services - Atheros and Zero Config - running. I'm wondering if there are still services which I might as well disable without affecting the normal operation of my PC, or if I have already disabled too much. I'm especially curious about "COM+ Event System"; I tried to make sense of its description in the services applet, but I find it very obscure.
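For anyone who wants to experiment the same way without clicking through the services applet every time, the built-in sc tool can query and reconfigure a service from the command line. A minimal sketch, assuming the short service name EventSystem (which is what XP uses internally for "COM+ Event System"):

sc query EventSystem                      (shows whether the service is currently running)
sc config EventSystem start= disabled     (the space after "start=" is required)
sc config EventSystem start= demand       (puts it back to Manual)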
-
For slow PCs where physical security isn't an issue, you might argue that it makes sense to enable this feature, so that you can turn on the PC, go do something else, and come back after 5 minutes to a fully loaded user environment, without having to check in between whether a logon box is waiting for you.
-
You might find this Microsoft TechNet article useful: http://www.microsoft.com/err/technet/

autolog.exe is available in 2 variants:
- the one provided by Mark Russinovich (link above)
- the one that comes with the Win32 resource kit
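Whichever variant you use, the underlying mechanism is the well-known set of Winlogon registry values, so the same thing can also be set up by hand. A minimal sketch - substitute real account details, and keep in mind that this manual route leaves the password in the registry in plain text (the tools may store it more securely):

reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v AutoAdminLogon /t REG_SZ /d 1 /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v DefaultUserName /t REG_SZ /d YourUser /f
reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon" /v DefaultPassword /t REG_SZ /d YourPassword /f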
-
64GB is about the amount of space that the nightly backup of a medium-size company would take, so I guess this would be a viable alternative to tape backups for such companies. The high-availability / disaster recovery strategy of such a company would benefit greatly from putting the backups on flash drives instead of tapes.
-
Yahoo Pager Registry Entry Nuisance
dserban replied to dserban's topic in Websites and Web Designing
Oh ... ummmmm ... duh!!! thanks -
I use Yahoo Messenger only intermittently, and I am very annoyed by the fact that whenever I start it, it automatically seeds itself as a startup program in the registry - without asking me for permission - by putting in an entry with the following profile:

Yahoo! Pager
Run - Startup
"C:\Program Files\Yahoo!\Messenger\YahooMessenger.exe" -quiet
Enabled
Current User

This is not the only piece of software that behaves like this. The QuickTime movie player inserts an autorun entry by the name of qttask.exe - I don't know what it does, but it certainly doesn't do anything that is useful to me, so I found a solution for it. RealPlayer does the same thing; the name of its autorun entry is realsched.exe, and I have applied the same solution to stop it in its tracks. As you can tell, I'm one of those people who go by the principle of "If it doesn't do something that is directly and immediately useful to me, it doesn't belong in my process list."

The solution I'm talking about consists of a small executable file that does absolutely nothing and returns an exit code of 0. I made two copies of this program, named them qttask.exe and realsched.exe, and copied the respective files into the folders where the original "perpetrator" files reside, with Overwrite-Yes. Problem solved. But in those two cases I was able to leverage the fact that qttask.exe and realsched.exe were separate programs, distinct from the media players, so I was still going to be able to use the respective media players without a problem.

However, Yahoo Messenger is different. The registry entry contains the name of the useful program itself, just with the switch "-quiet", so I can't apply the above trick. Does anyone know how to keep the dirty hands of Yahoo Messenger away from the registry?
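For reference, given the "Current User" scope shown in the profile above, the entry lives under the per-user Run key, so it can at least be deleted on demand - although, as described, Messenger will simply re-create it the next time it starts:

reg delete "HKCU\Software\Microsoft\Windows\CurrentVersion\Run" /v "Yahoo! Pager" /f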
-
For computing the difference in days between two dates, you can use the Excel function DATEDIF. One of the tutorials that shows you how is "Calculating an age between two dates using the DATEDIF function": http://www.meadinkent.co.uk/xl_birthday.htm
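As a quick illustration, assuming the earlier date sits in A1 and the later one in B1, the difference in days is:

=DATEDIF(A1,B1,"d")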
-
Hi Chesso,

It's not very clear from your description what your objective is, but based on a pure guess I would say you need to convert the two columns you are working with into one column by either:
- concatenating the two dates into something like 2007013120070731, or
- computing the difference in days between the two dates.

Once you have reached that stage, you need to apply the VLOOKUP function correctly. I will show you how to do that, because most people I have worked with are confused about how to use this function. Suppose your lookup table lives in columns A and B. After you click inside the Table_array text box, you need to select both columns A and B by first clicking the column header A, keeping the mouse button down, then moving the cursor over column B and releasing the mouse button. I find that it's here that most people get confused. Also, make sure that Range_lookup = FALSE, otherwise you get unpredictable results.
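To make that concrete, the finished formula would look something like this - the cell references are only for illustration, with D2 holding the value being looked up and the lookup table in columns A and B:

=VLOOKUP(D2,A:B,2,FALSE)

This returns the column-B value sitting next to the exact match of D2 in column A.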
-
There are several options if you want to have your video on a site:

a) Use the embedded Windows Media Player, as has been described above.
Pros:
- you can use your video file as-is, without conversion, as long as WMP can play it.
Cons:
- the user will sit there and wait until the movie has finished downloading in the background, not aware of the progress of the download, unable to decide whether or not to keep downloading the whole movie after watching the first few seconds / minutes.

b) Convert your movie into the SWF format and use it in conjunction with a relatively sophisticated, XML-configurable SWF player that has a preload option. An example of this can be found here, as well as on many other sites: http://forums.xisto.com/no_longer_exists/
Pros:
- the user can see the progress bar and is aware of the movie downloading
- the movie will start automatically after one third of it has been downloaded, thus avoiding the playback fragmentation that happens when the playback rate is faster than the download rate.
Cons:
- relatively difficult to set up; requires an advanced understanding of HTML, XML, and of the interaction between the SWF player and the rest of the web page.
- requires conversion of the video file into the SWF format (is there free, open source software to do that? I don't know).
- not entirely sure about copyright issues; the SWF player on that example site I gave you seems to have been "borrowed" from the desktop training CDs made by a company called Linux CBT.

c) Convert your movie into the FLV format and use it in conjunction with an open source player that is relatively small in size (29KB), called flvplayer.swf. This is the method I personally prefer for my purposes. An example of this can be found on one of my blogs at: //dead link// That one is a vacation video that my girlfriend took; its original format, as it came out of the digital camera, was MPEG.
Pros:
- progress bar, pause button and the capability to view the movie in full screen mode
- the setup is relatively simple: you can just copy the HTML code from the webpage I gave you and adjust the width and height according to the original resolution of your video to avoid pixelation, and you're good to go (you need to add 22 pixels to the height to account for the control bar, though - see the sketch below).
Cons:
- requires conversion of the video file into the FLV format (there is free, open source software to do that - it's called Riva FLV Encoder).
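Since my example link is dead, here is a minimal sketch of what such an flvplayer.swf embed typically looked like; treat the file= parameter name and the 320x240 source resolution as assumptions, because the exact flashvars depend on the player build:

<embed src="flvplayer.swf" type="application/x-shockwave-flash" flashvars="file=vacation.flv" width="320" height="262"></embed>

The height of 262 is the 240 pixels of video plus the 22-pixel control bar mentioned above.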
-
I was thinking more along the lines of a command-line-based SWF player that I can run several times, one run for each SWF file, and put the whole thing in an endless-loop type of script - something like the sketch below. Anyway, I have zero knowledge when it comes to developing Flash stuff.
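In shell terms, the endless-loop script I'm imagining would be something like this - just a sketch, with swfplayer standing in for whatever hypothetical command-line player would do the job:

while true; do
    for f in *.swf; do
        swfplayer "$f"    # hypothetical player: plays one file, then exits
    done
done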
-
I give it a 10 for their search engine, email service and Google Earth; an 8 for their maps service; a 6 for their Google Pages feature; a 1 for having to look past Google AdSense BS to get to the useful content on a page; a 1 for corporate corruption; and a 1 for violating my privacy when they collect data about what I look for on the Internet. All in all, I give Google a rating of 4.
-
Sometimes I find free software on the Internet which does something useful for me, and which also comes out clean and green after running it through antivirus and spyware checks, but I would still like to make sure it doesn't send out any data over the Internet to some obscure site. So the question is: how can I isolate a specific piece of software running on my Internet-connected PC and prevent it from connecting to the Internet? I want the software to think it's running on a PC whose network cards are all down.
-
Oracle themselves are offering such a service for free to anyone who is interested. The link is: /en/ You just have to create an account, choose the 2MB or 5MB storage option, then figure out the database connect string. The service is specifically created to host Oracle Application Express applications; Application Express plays the same role as more established web application servers like BEA WebLogic, Tomcat, JBoss etc., the difference being that the web rendering engine is PL/SQL instead of Java.
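Purely as a generic illustration of what an Oracle connect string looks like - the host, port and service name below are made up, and the real ones would come from whatever the signup process tells you:

sqlplus myuser/mypassword@//some.oracle.host:1521/orcl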
-
I have a collection of several locally stored, sound-only .swf files, and I'm looking for a player that can build a playlist out of them and cycle through them over and over again. It could be a Flash-based player or a standalone piece of software, I don't care. Anyone have any ideas?
-
I'm not sure whether I should start a new topic for this question, but I am curious to know whether you are happy with DownThemAll. Have you tested FlashGet, not liked it, and then switched to DownThemAll? Can you make a comparison? I currently have neither, because I don't understand the concept of a download accelerator. Specifically, I don't get how you can "steal" more juice from your ISP than their traffic shaping tools give you. But someone might be able to convince me.
-
I have the following setup on my PC:
- an external USB-connected hard drive / enclosure;
- several USB memory sticks, of which I'm using one, two or none at any given point in time;
- 2 pieces of software for mounting ISO images, for which I would like to have a variable number of configured drive letters.

My objective is to reserve a fixed drive letter for the external enclosure, so that references to it will be dependable in the future. Right now it is I:; one day it could be G:, and the day after that H:. Any ideas?
-
Mine says: Your current bandwidth reading is: 450.40kbps

It's nice to have a very simple bandwidth tester to point your friends to. speedtest.net is OK, but I don't like two things about it:
- it's Flash-based
- you have to select a server yourself (this might be its strength, in other people's opinion)

Does anyone know a purely text-based bandwidth tester site? I remember seeing a YouTube video a while ago where they filmed the same dual-boot PC connecting to a purely text-based bandwidth tester site, first under Windows XP, then under Ubuntu, making a point about the difference in speed. But I couldn't catch the name of the site.
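Not a site, but in the same no-frills spirit: where a shell is available, curl can serve as a crude text-only bandwidth tester against any large file you trust (the URL below is just a placeholder):

curl -o /dev/null -w 'average download speed: %{speed_download} bytes/sec\n' http://example.com/some_large_file.zip

The -o /dev/null part throws the downloaded data away, and -w prints the measured average at the end.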
-
Excellent tip, thanks a lot!!! I've been exploring Wink for the past couple of hours and it looks promising. What I would like to be able to do with Wink is take still frames from one .wnk project and insert them into a second .wnk project, kind of like alternating live-captured sequences with interspersed PowerPoint slides. I'm still researching the help for how to do that. I know Wink has a feature that allows you to insert text boxes, but I want a strict separation between comments and live video; that's why I prefer PowerPoint slides. Also, I would like to be able to do fade-out-to-black and fade-in-from-black transition effects, just like Windows Movie Maker allows you to do. By the way, I'm not happy with the degradation in video quality that Windows Movie Maker produces, so I'm looking for alternative ways to create those transition effects. Any ideas would be greatly appreciated.
-
I am currently using the freeware version of CamStudio to create screencast videos, which I then convert to FLV and post on the intranet at work. I am happy with the video quality of these videos. However, with CamStudio I have a small problem with the spacebar. It is the default command key to stop the screen recording, and although I looked everywhere, I couldn't find where to change this. I sometimes need to write stuff in text fields and text boxes, and it is very annoying that I have to find various workarounds - like using a dot instead of the spacebar - to make it work. Before CamStudio, I was using Lotus ScreenCam, but it stores video files in a proprietary format (.scm) which no other video conversion software knows what to do with. Besides, Lotus ScreenCam is no longer supported. Can anyone recommend some good screen capture video software that they have experience with and that works well?
-
Like most people, I use Google most of the time, and like most people, I sometimes get overwhelmed by the avalanche of results. That's about the only thing that makes Google useless once in a while. When I have this problem, I go to http://www-01.ibm.com/software/data/information-optimization/ ... they have an interesting concept for clustering results. Back to Google, though. When I look for podcasts (wink, wink) on Google, I put the string inurl:mp3 after the keywords. For example: "Frank Sinatra" inurl:mp3 Make sure you don't do anything illegal while looking for those podcasts (wink, wink).
-
Bash Tips And Tricks From My Own Experience - Several Instalments
dserban replied to dserban's topic in Programming
Instalment 2: How I use POSIX utilities to process text

One of the problems I have very often is that I would like to extract usable links from a very large web page whose source code looks messy (when I look at it using "View Source"). Have you ever looked at a cool web page with nice visual effects, and the first thing that comes to your mind is: "How did he do that? I would also like to have those effects on my own site."? Or maybe the page has lots of links to pretty pictures and you just want the links without the other "fat". And you go to view the source and it looks like this (I'm using made-up example.com URLs here):

<html><head><title>Fun Pics</title></head><body><a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1</a><br><a href=http://www.example.com/images/funny_home_video_002.jpg>Pic 2</a><br><a href=http://www.example.com/images/funny_home_video_003.jpg>Pic 3</a><br><a href=http://www.example.com/images/funny_home_video_004.jpg>Pic 4</a><br><a href=http://www.example.com/images/funny_home_video_005.jpg>Pic 5</a><br><a href=http://www.example.com/images/funny_home_video_006.jpg>Pic 6</a><br><a href=http://www.example.com/images/funny_home_video_007.jpg>Pic 7</a></body></html>

Now you want to make some sense out of that never-ending HTML string. You probably already have your favorite tool handy to do that easily, and that's fine, but my objective here is to showcase the use of POSIX utilities; that's why I will show you what might look like the hard way of doing it. For simplicity, I will assume that the above HTML code is stored in the file messyHTML.html. My objective is to obtain a clean list of all the links pointing to jpg files, each one on its own line. Something like this:

http://www.example.com/images/funny_home_video_001.jpg
http://www.example.com/images/funny_home_video_002.jpg
http://www.example.com/images/funny_home_video_003.jpg
http://www.example.com/images/funny_home_video_004.jpg
http://www.example.com/images/funny_home_video_005.jpg
http://www.example.com/images/funny_home_video_006.jpg
http://www.example.com/images/funny_home_video_007.jpg

First of all, I would like to get a rough overview of the structure of the web page by having each HTML tag on its own line:

# cat messyHTML.html | sed 's#<#\
> <#g'
<html>
<head>
<title>Fun Pics
</title>
</head>
<body>
<a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_002.jpg>Pic 2
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_003.jpg>Pic 3
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_004.jpg>Pic 4
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_005.jpg>Pic 5
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_006.jpg>Pic 6
</a>
<br>
<a href=http://www.example.com/images/funny_home_video_007.jpg>Pic 7
</a>
</body>
</html>
#

OK, so what did I do? Let's look at the command again; it might seem complex and cryptic at first, so let's break it down. To eliminate some confusion related to how I configured my shell prompts (the sets of characters "# " and "> "), here is what the command looks like without them:

cat messyHTML.html | sed 's#<#\
<#g'

The two utilities cat and sed communicate with each other using a pipe (the character "|"). The pipe redirects the output of the cat command from where it would normally go (the terminal output) into the standard input of the sed command. Of course I could have done this using only the sed command, by pointing it at the file, but I have chosen to use two commands for several reasons, one of which is to make the sed command as easy to understand as possible - which is not an easy task. The other reason was to give this tutorial some continuity: you will see what I mean later on, when I use sed one more time, in a slightly more complex manner.

You are probably wondering why the command spans two lines. Well, try to put the two lines back together in your imagination, but keep in mind that they are still separated by the binary representation of the carriage return. The imaginary command might look like this:

cat messyHTML.html | sed 's#<#\{binary representation of the carriage return}<#g'

The carriage return is a special character that the shell will interpret according to its own specification, unless we "escape" it. Escaping a character means instructing the shell not to give that character any special meaning and to treat it "as-is". Escaping is done in the bash shell by putting a backslash in front of the special character. In other words, the sed script replaces every "<" with a newline followed by "<", which is what splits the tags onto separate lines.

Now we need to look at the sed command in more detail. sed is a stream editor that many people use primarily as a tool to mass-replace patterned strings of text. A very basic example of how sed works:

# echo "abcdefghi" | sed 's#de#WWWWW#'
abcWWWWWfghi
#

In this example I used sed to replace the first occurrence of "de" with "WWWWW" in the string "abcdefghi". Now look at the following example, where I append an additional occurrence of "de" to the end of our string and run the same command again:

# echo "abcdefghide" | sed 's#de#WWWWW#'
abcWWWWWfghide
#

If our objective is to replace ALL occurrences of "de", we simply specify the "g" switch, which does a global replace:

# echo "abcdefghide" | sed 's#de#WWWWW#g'
abcWWWWWfghiWWWWW
#

On a side note, I personally tend to use the hash mark (the "#" character) as the separator in sed, because I often find myself needing to replace strings that contain slashes, but most people I have seen use the slash as a separator, like this:

# echo "abcdefghide" | sed 's/de/WWWWW/g'
abcWWWWWfghiWWWWW
#

Excellent!!! I'm only interested in links to pictures, so after quickly eyeballing the structure of the web page, I decide that I only want to keep those lines that contain the string "images":

# cat messyHTML.html | sed 's#<#\
> <#g' | grep images
<a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1
<a href=http://www.example.com/images/funny_home_video_002.jpg>Pic 2
<a href=http://www.example.com/images/funny_home_video_003.jpg>Pic 3
<a href=http://www.example.com/images/funny_home_video_004.jpg>Pic 4
<a href=http://www.example.com/images/funny_home_video_005.jpg>Pic 5
<a href=http://www.example.com/images/funny_home_video_006.jpg>Pic 6
<a href=http://www.example.com/images/funny_home_video_007.jpg>Pic 7
#

grep is a very powerful and very underutilized tool - "underutilized" in the sense that it's not utilized to its full potential: 99% of the time, people only use 1% of its power. I will probably spend some time in a future instalment exploring its capabilities. It is that 1% that I am using here as well: basic filtering on plain (unpatterned) text.

At this point we are one step away from reaching our objective, and, as always in UNIX, there is more than one way of performing the next step. Let me explain. In this particular example, all the links we are interested in begin at the same offset and have the same length, so one is strongly tempted to leverage that and use the cut command. For educational purposes, I will go ahead and show you how cut works, but in real-life situations I always use an advanced feature of sed called backreferencing, which I will show you further below, so keep reading.

Let's look at how our output lines are structured:

<a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1
12345678901234567890123456789012345678901234567890123456789012345678
         |         |         |         |         |         |
        10        20        30        40        50        60

So where's the "beef"? Well, the "beef" begins at position 9 and ends at position 62. Let's perform a single test:

# echo "<a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1" | cut -c9-62
http://www.example.com/images/funny_home_video_001.jpg
#

It works as expected; let's do the whole thing in one fell swoop:

# cat messyHTML.html | sed 's#<#\
> <#g' | grep images | cut -c9-62
http://www.example.com/images/funny_home_video_001.jpg
http://www.example.com/images/funny_home_video_002.jpg
http://www.example.com/images/funny_home_video_003.jpg
http://www.example.com/images/funny_home_video_004.jpg
http://www.example.com/images/funny_home_video_005.jpg
http://www.example.com/images/funny_home_video_006.jpg
http://www.example.com/images/funny_home_video_007.jpg
#

Excellent!!! Are we done? Well, yes and no. This was the lazy man's way; now let me show you the right way:

# cat messyHTML.html | sed 's#<#\
> <#g' | grep images | sed 's#^.*\(http.*jpg\).*$#\1#'
http://www.example.com/images/funny_home_video_001.jpg
http://www.example.com/images/funny_home_video_002.jpg
http://www.example.com/images/funny_home_video_003.jpg
http://www.example.com/images/funny_home_video_004.jpg
http://www.example.com/images/funny_home_video_005.jpg
http://www.example.com/images/funny_home_video_006.jpg
http://www.example.com/images/funny_home_video_007.jpg
#

Does that look like command line garbage or what? Take a deep breath, I'm getting ready to explain that scary thing at the end:

's#^.*\(http.*jpg\).*$#\1#'

But before I do, let us make a quick mental note that, although seemingly vastly more complex, this approach makes absolutely no assumptions whatsoever about where the links begin and where they end.

The following is a regular expression pattern that matches the whole line (each and every line in its entirety):

^.*\(http.*jpg\).*$

It matches the whole line because it begins with the character "^" and ends with the character "$". When used in a regular expression pattern, the character "^" always matches the beginning of the line and the character "$" always matches the end of the line. Note that I am making use of terms which I have not yet explained, terms like "regular expression" and "pattern". If you want a formal definition of these terms, please feel free to google them, then come back to this tutorial. I prefer to define these entities by showing you how they are used in practice.

When used in a regular expression pattern, a dot matches any character. A dot followed by an asterisk matches any string of characters, including the empty string. This is because, when used in a regular expression pattern, an asterisk matches zero or more occurrences of the character preceding it. For instance, the pattern a* will match any one of the following strings:
- the empty string
- a
- aa
- aaaaaaaaa etc. ... you get the idea

The regular expression pattern ^.*\(http.*jpg\).*$ makes use of four anchors, two of which we have already discussed (the beginning and the end of the line). The other two are "(http" and "jpg)". Here I am showing the parentheses without their respective preceding backslashes, but keep in mind that these parentheses are special characters that need to be escaped so that the shell leaves them alone and they therefore become available for use by sed. These two anchors mean that whatever text matches the pattern "http.*jpg" inside the parentheses will temporarily be stored in a special location in sed's memory called a backreference. From that moment on, the character string stored in that special location can be referenced by the name "\1". sed provides a maximum of nine such backreferences, \1 through \9. Backreferences \2 and above become available as soon as you have more than one set of parentheses.

It should now be obvious that the command sed 's#^.*\(http.*jpg\).*$#\1#' entirely replaces each and every line from the standard input with the string we want.

One more caveat about pattern matching: sed's regular expression engine is "greedy", meaning that, should there be several http.*jpg links on one line, the backreferencing trick would only partially provide the expected results. Let's give an example using a slightly modified version of the web page, in which the last two links share a line:

# cat test.html
<html>
<head>
<title>Fun Pics</title>
</head>
<body>
<a href=http://www.example.com/images/funny_home_video_001.jpg>Pic 1</a><br>
<a href=http://www.example.com/images/funny_home_video_002.jpg>Pic 2</a><br>
<a href=http://www.example.com/images/funny_home_video_003.jpg>Pic 3</a><br>
<a href=http://www.example.com/images/funny_home_video_004.jpg>Pic 4</a><br>
<a href=http://www.example.com/images/funny_home_video_005.jpg>Pic 5</a><br>
<a href=http://www.example.com/images/funny_home_video_006.jpg>Pic 6</a><br><a href=http://www.example.com/images/funny_home_video_007.jpg>Pic 7</a>
</body>
</html>
#
# cat test.html | grep images | sed 's#^.*\(http.*jpg\).*$#\1#'
http://www.example.com/images/funny_home_video_001.jpg
http://www.example.com/images/funny_home_video_002.jpg
http://www.example.com/images/funny_home_video_003.jpg
http://www.example.com/images/funny_home_video_004.jpg
http://www.example.com/images/funny_home_video_005.jpg
http://www.example.com/images/funny_home_video_007.jpg
#

Notice that the link to picture 6 has disappeared. On that last line, the first and second occurrences of ".*" both behave in a greedy manner, each one of them trying to "swallow" the character string

http://www.example.com/images/funny_home_video_006.jpg>Pic 6</a><br><a href=

but precedence is given to the first one, so only the link to picture 7 survives in the backreference.
-
Debian Users, Please Post Your Source List Here
dserban replied to dserban's topic in Websites and Web Designing
Maybe I should have begun by stating my objective. I want to start by downloading, to an FTP location on my network, all the files pertaining to ffmpeg and vlc. Then, whenever I boot Knoppix on one of my PCs, I want to be able to quickly replace the sources.list file to point to this FTP location and run an "apt-get install ffmpeg vlc". I asked the question above because I would like to know what other interesting multiverse software people are downloading and installing on their Debian boxes, so I can try it out myself.
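For the record, the sources.list swap I'm describing would boil down to a single line along these lines - the IP address, directory and distribution name here are made up for illustration:

deb ftp://192.168.0.10/debian-local sarge main contrib non-free

followed by an apt-get update before the apt-get install.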