
Guardians of the Galaxy 3 May Start Filming As Soon As Next Year

James Gunn revealed that the movie could start filming by the end of 2018.

Prior to the launch of Guardians of the Galaxy Vol. 2, Gunn announced that he would return to direct the third installment after previously showing some hesitancy. He has already been working on the script and hinted at a 2020 release, and he has now revealed that filming could start by the end of 2018.

James Gunn recently spoke with Collider in honor of the upcoming home video release of Guardians of the Galaxy Vol. 2, which will be the first ever Disney movie to be released in 4K Ultra HD, as well as the standard Blu-ray/DVD release on August 22. During the course of the conversation, he said that it will be “a little more than year” until filming on Guardians of the Galaxy Vol. 3, which he says will probably be the title of the movie unless something changes, will begin. But he is currently writing the script and, as he tells it, things are going great. Here’s what director James Gunn had to say about it.

“It’s been pretty easy. The truth is, the first movie is the first act, the second movie is the second act, and the third movie is the third act so I’m tying a lot of stuff together in the third film. We get a lot of answers on a lot of different things, so doing that in an elegant way takes a little bit of grace and elegance. It’s more challenging in that respect writing the third movie than the second movie.”

Ever since it was officially confirmed that Gunn was returning for Vol. 3, there has been heavy speculation that Marvel will keep the Guardians in the prime May slot in 2020. However, Marvel’s production schedule has also typically meant a film starts production roughly a year ahead of its release. Gunn’s timetable does have plenty of leeway with a late 2018 or early 2019 start fitting his guidelines, but a start in 2018 could make a late 2019 release possible.

The final installment will follow the same thorough process Gunn goes through when developing each of his films. “I do an incredibly in-depth treatment for every movie,” Gunn said. “I think of writing a screenplay as creating the body of a human being and you’ve got to start with the skeleton, start with the bones, and you create the bones. You take a lot of time because that’s the actual base of the movie and if you screw that part up, later on down the line, you’re going to have a lot of mistakes. So by creating a really strong foundation for the story, that’s the most important thing. So I write a good treatment that’s probably about 70 pages long. It includes photographs and things like that. So that has been the way I’ve dealt with every movie and this one as well.”

The Guardians of the Galaxy will meet up with Thor before joining the rest of the Avengers in Infinity War, due out May 4, 2018. There’s no projected release date for Guardians of the Galaxy Vol. 3.


The Dark Tower: See What Critics Are Saying

Reviews of the sci-fi blockbuster, released in the U.S. Friday, range from scathing to downright obliterating.

The embargo has lifted for reviews on Sony Pictures’ The Dark Tower and the critical consensus can charitably be summed up with one word: meh. The Dark Tower is currently rated a lowly 18 percent on the divisive review aggregator Rotten Tomatoes. That’s far below the 51 percent scored by Luc Besson’s ambitious sci-fi thriller Valerian and the City of a Thousand Planets, one of the summer’s biggest box office misfires.

Directed and co-written by Nikolaj Arcel, the film stars Idris Elba as Roland Deschain, a gunslinger on a quest to protect the Dark Tower, a mythical structure which supports all realities, and Matthew McConaughey as his nemesis, Walter o’Dim, the “Man in Black”.

While it will be very interesting to see how the film performs at the box office this weekend, it’s clear that The Dark Tower won’t be winning any Academy Awards. Here is what the critics are saying.

IGN – Marty Silva

“The deeply flawed and compellingly tragic characters that King created are one-dimensional in their on-screen adaptations because the motivations that give them that depth are completely lost to the wind. That’s not to say the performances are bad – in fact, I absolutely adore the casting of the leads. […] But there’s no meat on the bone of the script for arguably two of the finest actors of our time to really dig in and give us something we haven’t seen before.”

Birth Movies Death – Scott Wampler

“The Dark Tower is a deeply flawed movie. It’s a film that feels rushed and plodding, sometimes within the same scene. It’s a film that saddles two of our greatest working actors with clunky dialogue and muddled motivations. It’s a film that feels claustrophobic and oddly contained when it should’ve felt sweeping and epic. After decades of waiting, after months of keeping our fingers crossed and hoping for the best, it brings me zero pleasure to report that The Dark Tower doesn’t really work.”

Indie Wire – Kate Erbland

“Fans of King’s books will likely be disappointed by the way this long-awaited film adaptation speeds through essential plot points and frantically introduces characters with little in the way of rhythm or care, all in service of a rushed finale that will leave plenty scratching their heads. A tight story is one thing, but a 95-minute feature that is unable to give even the slightest inkling that it’s based on a grand-scale epic masterpiece is something else entirely. The whole universe is at stake here, but “The Dark Tower” wastes precious little time before it delivers any big moments, mostly care of listless action sequences that barely get moving before they’re cut short.”

The Wrap – Dan Callahan

“Most of the scenes in “The Dark Tower” feel like a desperate compromise of some kind, and often there seem to be scenes missing that would simply get us from one point to another. With fantasy material like this, we need to be made to believe in the inventions and the conceits, and we cannot do that if they are shot and staged in such a truncated and perfunctory way.”

THR – John DeFore

“Though far from the muddled train wreck we’ve been led to expect, this Tower lacks the world-constructing gravitas of either the Tolkien books that inspired King or the franchise-launching movies that Sony execs surely have in mind. Though satisfying enough to please many casual moviegoers drawn in by King’s name and stars Idris Elba and Matthew McConaughey, it will likely disappoint many serious fans and leave other newbies underwhelmed.”

Collider – Matt Goldberg

“The Dark Tower doesn’t even really do us the courtesy of being laughably bad. That would take some level of ambition, which the movie studiously avoids at almost every turn. Instead, it simply exists, eager to be overlooked and forgotten. It’s a shame that this adaptation didn’t have the funding or the vision to be something remarkable because you can see glimmers of a more ambitious, exciting movie. Sadly, Arcel approaches the story with a flat, uninteresting style, never daring to challenge his audience, invest in his characters, or give us a reason to care. The Dark Tower doesn’t fall because of a child’s mind. It falls because it’s too embarrassed to stand.”

Stargate Returns with – Stargate Origins – Trailer Comic-Con International 2017 HD

Twenty years ago, Stargate SG-1 debuted on Showtime and transformed the single film into an enduring live-action TV franchise that spanned three series. And while it’s been six years since Stargate Universe came to an end, MGM is finally ready to open the iris for a brand new series.



At the Stargate SG-1 20th anniversary panel at Comic-Con International, MGM revealed that a new Stargate series, Stargate Origins, is slated to begin filming next month. The leading character will be Catherine Langford, the young woman who witnessed her father, Professor Paul Langford, uncover the Stargate in Giza in 1928, as seen in the film. In 1996, Catherine hired Daniel Jackson to translate the symbols on the Stargate, which marked the beginning of the larger story.

It’s not clear if Stargate Origins is a reboot of the TV series timeline or a retcon, as Stargate SG-1 established that Catherine did not go through the gate until much later in life, but she did lose her fiancé, Ernest Littlefield, to an early experiment with the gate. According to MGM, the new series will follow Catherine as she “embarks on an unexpected adventure to unlock the mystery of what lies beyond the Stargate in order to save Earth from unimaginable darkness.”


Stargate Origins will be a ten-episode event series, written by Mark Ilvedson and Justin Michael Terry, with Mercedes Bryce Morgan lined up as the director. Since production is starting next month, casting news should be available soon. The other major reveal of the panel is that Stargate Origins will be produced directly for Stargate Command, a new online home for the franchise.


As you may have suspected, Stargate Command will be a subscription service and it will presumably host the previous TV series as well as other content from the franchise’s history. We wouldn’t be surprised if Stargate Origins is only the beginning of their plans for new content. But it may be some time before we learn more. Stargate Command isn’t scheduled to officially launch until this fall.

Are you excited about the return of Stargate? Let us know in the comment section below!


Net neutrality is in real jeopardy, and we’re banding together in support of strong net neutrality rules that give people the power to choose which websites and apps are best. Your internet service providers and some at the Federal Communications Commission want to change these rules and potentially limit your access to the best of the internet. So what can you do? Take action, and tell the FCC that you care about the open internet and competition online.

Do you like the internet? Great, we do too. A lot.

Cat GIFs, the sum-total of human knowledge at your fingertips, the most democratizing power the world has ever known. What’s not to like?

Unfortunately, some in DC want to fundamentally change the internet for the worse. Internet service providers — or ISPs — say they support the “principles” of net neutrality, but well…

Don’t worry, we’ve got you covered.

In reality, your ISP wants to roll back the current net neutrality rules and gain the ability to control your experience online.

Specifically they could prioritize websites and apps they own over the rest of the web…

“That’s a nice website you’ve got there. It’d be a shame if something happened to it.”

Gain the power to block websites…

Error 404 – Website Not Found (because your ISP blocked it).

Or even slow down your connection to content they don’t like…

Hello, are you there?

IA, our member companies, and the rest of the internet community have been carefully watching what’s happening in Washington

Juni Cortez has been watching too.

And we need you to speak out in support of net neutrality

Really, we can’t do this on our own. We need you, and the entire internet community, to speak up.

So what do I do now?

We think it’s important to write a personal comment to the FCC about why you support strong net neutrality rules, and what access to the entire internet means to you. We need as many comments as possible. So be unique!
Here’s how it works:

1. Think about what to say. Need some help? Check out this Mashable article.
2. Click on the “+Express” button after getting to the FCC site.
3. Fill out the required info, leave your comment, then follow the prompts.

Ready? Tell The FCC To Keep Net Neutrality

More Information can be found here: 

WannaCry a Birthday Gift??

I woke up on the 12th of May, my birthday, looked at my news feed, and saw a burst of articles about the WannaCry ransomware that has swept across the globe.

In the last few days, a new type of malware called Wannacrypt has done worldwide damage.  It combines the characteristics of ransomware and a worm and has hit a lot of machines around the world from different enterprises or government organizations:

While everyone’s attention related to this attack has been on the vulnerabilities in Microsoft Windows XP, please pay attention to the following:

  • The attack works on all versions of Windows if they haven’t been patched since the March patch release!
  • The malware can only exploit those vulnerabilities once it is on the network, so it first has to get in.  There are reports it is being spread via email phishing or malicious web sites, but these reports remain unconfirmed.


Please take the following actions immediately:

  • Make sure all systems on your network are fully patched, particularly servers.
  • As a precaution, please ask all colleagues at your location to be very careful about opening email attachments and minimise browsing the web while this attack is on-going.


The vulnerabilities are fixed by the security patches below, which Microsoft released in March 2017. Please ensure you have patched your systems:

Details of the malware can be found below.  The worm scans TCP port 445, which is used by the Windows SMB file-sharing service:
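To see which of your own machines expose this port, a quick reachability check can help. The following is only a minimal sketch using Python's standard library. The function name `is_port_open` is my own invention, and it should only be run against hosts you are authorised to test.

```python
import socket


def is_port_open(host, port=445, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection performs the full TCP handshake
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or the name did not resolve
        return False
```

Calling `is_port_open("fileserver.example")` (a placeholder host name) returns `True` when the SMB port answers. An open port does not by itself mean the machine is vulnerable, only that it is reachable and should be confirmed as patched.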

A preliminary study shows that our environment is not infected, based on all hashes and domains found:



MD5 hash:
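Comparing files against published MD5 values can be scripted. The sketch below assumes you supply the hash values yourself through the `known_bad` argument (none are included here), and the helper names `md5_of_file` and `find_matches` are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=65536):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matches(root, known_bad):
    """Return files under root whose MD5 appears in known_bad (case-insensitive)."""
    known = {value.lower() for value in known_bad}
    return [
        path
        for path in Path(root).rglob("*")
        if path.is_file() and md5_of_file(path) in known
    ]
```

An empty result only says that none of the supplied hashes were found under that folder. It is not proof that a machine is clean.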



Per Symantec, here is a full list of the filetypes that are targeted and encrypted by WannaCry:

  • .123
  • .3dm
  • .3ds
  • .3g2
  • .3gp
  • .602
  • .7z
  • .ARC
  • .PAQ
  • .accdb
  • .aes
  • .ai
  • .asc
  • .asf
  • .asm
  • .asp
  • .avi
  • .backup
  • .bak
  • .bat
  • .bmp
  • .brd
  • .bz2
  • .cgm
  • .class
  • .cmd
  • .cpp
  • .crt
  • .cs
  • .csr
  • .csv
  • .db
  • .dbf
  • .dch
  • .der
  • .dif
  • .dip
  • .djvu
  • .doc
  • .docb
  • .docm
  • .docx
  • .dot
  • .dotm
  • .dotx
  • .dwg
  • .edb
  • .eml
  • .fla
  • .flv
  • .frm
  • .gif
  • .gpg
  • .gz
  • .hwp
  • .ibd
  • .iso
  • .jar
  • .java
  • .jpeg
  • .jpg
  • .js
  • .jsp
  • .key
  • .lay
  • .lay6
  • .ldf
  • .m3u
  • .m4u
  • .max
  • .mdb
  • .mdf
  • .mid
  • .mkv
  • .mml
  • .mov
  • .mp3
  • .mp4
  • .mpeg
  • .mpg
  • .msg
  • .myd
  • .myi
  • .nef
  • .odb
  • .odg
  • .odp
  • .ods
  • .odt
  • .onetoc2
  • .ost
  • .otg
  • .otp
  • .ots
  • .ott
  • .p12
  • .pas
  • .pdf
  • .pem
  • .pfx
  • .php
  • .pl
  • .png
  • .pot
  • .potm
  • .potx
  • .ppam
  • .pps
  • .ppsm
  • .ppsx
  • .ppt
  • .pptm
  • .pptx
  • .ps1
  • .psd
  • .pst
  • .rar
  • .raw
  • .rb
  • .rtf
  • .sch
  • .sh
  • .sldm
  • .sldx
  • .slk
  • .sln
  • .snt
  • .sql
  • .sqlite3
  • .sqlitedb
  • .stc
  • .std
  • .sti
  • .stw
  • .suo
  • .svg
  • .swf
  • .sxc
  • .sxd
  • .sxi
  • .sxm
  • .sxw
  • .tar
  • .tbk
  • .tgz
  • .tif
  • .tiff
  • .txt
  • .uop
  • .uot
  • .vb
  • .vbs
  • .vcd
  • .vdi
  • .vmdk
  • .vmx
  • .vob
  • .vsd
  • .vsdx
  • .wav
  • .wb2
  • .wk1
  • .wks
  • .wma
  • .wmv
  • .xlc
  • .xlm
  • .xls
  • .xlsb
  • .xlsm
  • .xlsx
  • .xlt
  • .xltm
  • .xltx
  • .xlw
  • .zip
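A list like this can be used to estimate how many of your own files would be affected. Here is a rough sketch: `TARGETED` holds only a small excerpt of the extensions above, and the function name `at_risk_files` is an assumption of mine.

```python
from pathlib import Path

# Excerpt only: a handful of extensions copied from the list above
TARGETED = {".doc", ".docx", ".xls", ".xlsx", ".pdf", ".jpg", ".zip", ".sql"}


def at_risk_files(root, extensions=TARGETED):
    """Return files under root whose extension is in the targeted set."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )
```

The comparison is lowercased on both sides, so `REPORT.PDF` and `report.pdf` are treated the same way.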

As you can see, the ransomware covers nearly any important file type a user might have on his or her computer. It also installs a text file on the user’s desktop with the following ransom note:


Predictive learning problems

From my previous post How to Teach a Computer to Distinguish Cats from Dogs

Predictive learning problems constitute the majority of tasks machine learning can
be used to solve today, and they apply to a wide array of situations and data types. In
this section we introduce the two major predictive learning problems: regression and
classification.


Suppose we wanted to predict the share price of a company that is about to go public (that is, when a company first starts offering its shares of stock to the public). Following the pipeline discussed in Section 1.1.1, we first gather a training set of data consisting of a number of corporations (preferably active in the same domain) with known share prices. Next, we need to design feature(s) that are thought to be relevant to the task at hand. The company’s revenue is one such potential feature, as we can expect that the higher the revenue the more expensive a share of stock should be. Now in order to connect the share price to the revenue, we train a linear model or regression line using our training data.

machine learning graph 1
Figure 1.7 (top left panel) A toy training dataset of ten corporations with their associated share price and revenue values. (top right panel) A linear model is fit to the data. This trend line models the overall trajectory of the points and can be used for prediction in the future as shown in the bottom left and bottom right panels.

The top panels of Fig. 1.7 show a toy dataset comprising share price versus revenue
information for ten companies, as well as a linear model fit to this data. Once the model
is trained, the share price of a new company can be predicted based on its revenue, as
depicted in the bottom panels of this figure. Finally, comparing the predicted price to the
actual price for a testing set of data we can test the performance of our regression model
and apply changes as needed (e.g., choosing a different feature). This sort of task, fitting
a model to a set of training data so that predictions about a continuous-valued variable
(e.g., share price) can be made, is referred to as regression. We now discuss some further
examples of regression.
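To make the idea concrete, here is a minimal sketch of fitting a regression line by ordinary least squares in plain Python. The revenue and price numbers are invented for illustration and are not the data behind Fig. 1.7, and the function names are my own.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted output for a new input x."""
    return slope * x + intercept


def mean_abs_error(xs, ys, slope, intercept):
    """Average absolute gap between predicted and actual values on a testing set."""
    return sum(abs(predict(x, slope, intercept) - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up training data: revenue (feature) and share price (output)
revenue = [1.0, 2.0, 3.0, 4.0, 5.0]
price = [12.0, 19.0, 33.0, 38.0, 52.0]

slope, intercept = fit_line(revenue, price)
print(predict(6.0, slope, intercept))  # predicted price for a new company
```

`fit_line` plays the role of training, `predict` handles a new company, and `mean_abs_error` is one simple way to score the model on a testing set.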


Example 1.1 The rise of student loan debt in the United States

Figure 1.8 shows the total student loan debt, that is money borrowed by students to pay for college tuition, room, and board, etc., held by citizens of the United States from 2006 to 2014, measured quarterly. Over the eight-year period reflected in this plot total student debt has tripled, totaling over one trillion dollars by the end of 2014. The regression line (in magenta) fit to this dataset represents the data quite well and, with its sharp positive slope, emphasizes the point that student debt is rising dangerously fast. Moreover, if this trend continues, we can use the regression line to predict that total student debt will reach a total of two trillion dollars by the year 2026.
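Reading a date off a trend line like this amounts to solving the line equation for the input. A small sketch follows, where the function name and the numbers in the usage note are placeholders rather than the fitted values from the figure.

```python
def input_reaching(target, slope, intercept):
    """Solve slope * x + intercept = target for x."""
    if slope == 0:
        raise ValueError("a flat line never reaches a different target")
    return (target - intercept) / slope
```

For a made-up line with slope 0.5 and intercept -1000, `input_reaching(10.0, 0.5, -1000.0)` gives the x value at which the line hits 10. The further the answer lies beyond the observed data, the more it depends on the trend continuing unchanged.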

Figure 1.8  Total student loan debt in the United States measured quarterly from 2006 to 2014. The rapid increase of the debt, measured by the slope of the trend line fit to the data, confirms the concerning claim that student debt is growing (dangerously) fast. The debt data shown in this figure was taken from [46].

Example 1.2 Associating genes with quantitative traits

Genome-wide association (GWA) studies (Fig. 1.9) aim at understanding the connections between tens of thousands of genetic markers, taken from across the human genome of numerous subjects, with diseases like high blood pressure/cholesterol, heart disease, diabetes, various forms of cancer, and many others [26, 76, 80]. These studies are undertaken with the hope of one day producing gene-targeted therapies, like those used to treat diseases caused by a single gene (e.g., cystic fibrosis), that can help individuals with these multifactorial diseases. Regression, a commonly employed tool in GWA studies, is used to understand complex relationships between genetic markers (features) and quantitative traits like the level of cholesterol or glucose (a continuous output variable).

Figure 1.9 – Conceptual illustration of a GWA study employing regression, wherein a quantitative trait is to be associated with specific genomic locations.


The machine learning task of classification is similar in principle to that of regression. The key difference between the two is that instead of predicting a continuous-valued output (e.g., share price, blood pressure, etc.), with classification what we aim at predicting takes on discrete values or classes. Classification problems arise in a host of forms. For example, object recognition, where different objects from a set of images are distinguished from one another (e.g., handwritten digits for the automatic sorting of mail or street signs for semi-autonomous and self-driving cars), is a very popular classification problem. The toy problem of distinguishing cats from dogs discussed in How to Teach a Computer to Distinguish Cats from Dogs was such a problem. Other common classification problems include speech recognition (recognizing different spoken words for voice recognition systems), determining the general sentiment of a social network like Twitter towards a particular product or service, as well as determining what kind of hand gesture someone is making from a finite set of possibilities (for use in e.g., controlling a computer without a mouse). Geometrically speaking, a common way of viewing the task of classification is one of finding a separating line (or hyperplane in higher dimensions) that separates the two classes of data from a training set as best as possible. This is precisely the perspective on classification we took in describing the toy example in Section 1.1, where we used a line to separate (features extracted from) images of cats and dogs. New data from a testing set is then automatically classified by simply determining which side of the line/hyperplane the data lies on. Figure 1.10 illustrates the concept of a linear model or classifier used for performing classification on a 2-dimensional toy dataset.

Figure 1.10 – (top left panel) A toy 2-dimensional training set consisting of two distinct classes, red and blue. (top right panel) A linear model is trained to separate the two classes. (bottom left panel) A test point whose class is unknown. (bottom right panel) The test point is classified as blue since it lies on the blue side of the trained linear classifier.
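Checking which side of a line or hyperplane a point lies on comes down to the sign of a weighted sum. A minimal sketch, where the weights, bias, and the choice of which side counts as blue are all arbitrary illustrative values:

```python
def classify(point, weights, bias, positive="blue", negative="red"):
    """Label a point by the side of the hyperplane weights . x + bias = 0 it falls on."""
    score = sum(w * x for w, x in zip(weights, point)) + bias
    return positive if score > 0 else negative


# Hypothetical 2-dimensional classifier: the line x2 = x1
weights, bias = (-1.0, 1.0), 0.0
print(classify((0.2, 0.8), weights, bias))  # point above the line
print(classify((0.8, 0.2), weights, bias))  # point below the line
```

The same function works unchanged in higher dimensions, since only the lengths of `point` and `weights` change.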


Example 1.3 Object detection

Object detection, a common classification problem, is the task of automatically identifying a specific object in a set of images or videos. Popular object detection applications include the detection of faces in images for organizational purposes and camera focusing, pedestrians for autonomous driving vehicles, and faulty components for automated quality control in electronics production. The same kind of machine learning framework, which we highlight here for the case of face detection, can be utilized for solving many such detection problems.
After training a linear classifier on a set of training data consisting of facial and nonfacial images, faces are sought after in a new test image by sliding a (typically) square window over the entire image. At each location of the sliding window, the image content inside is tested to see which side of the classifier it lies on (as illustrated in Fig. 1.11). If the (feature representation of the) content lies on the “face side” of the classifier the content is classified as a face.
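The sliding-window search can be sketched in a few lines. In this sketch the image is a plain list of pixel rows, and `is_face` is a stand-in for the real test, which would extract features from the patch and check which side of the trained classifier they fall on. All names here are assumptions for illustration.

```python
def sliding_windows(height, width, size, stride=1):
    """Yield the (top, left) corner of every size x size window inside the image."""
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            yield top, left


def detect(image, size, is_face, stride=1):
    """Return the corners of all windows that the classifier accepts."""
    height, width = len(image), len(image[0])
    hits = []
    for top, left in sliding_windows(height, width, size, stride):
        # Cut out the content of the window at this location
        patch = [row[left:left + size] for row in image[top:top + size]]
        if is_face(patch):
            hits.append((top, left))
    return hits
```

A larger `stride` checks fewer windows and runs faster, at the risk of stepping over a face that only fits the window at an in-between position.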

Figure 1.11 – To determine if any faces are present in a test image (in this instance an image of the Wright brothers, inventors of the airplane, sitting together in one of their first motorized flying machines in 1908) a small window is scanned across its entirety. The content inside the box at each instance is determined to be a face by checking which side of the learned classifier the feature representation of the content lies on. In the figurative illustration shown here the areas above and below the learned classifier (shown in black on the right) are the “face” and “non-face” sides of the classifier, respectively.


Next up will be Feature designs.

Tips to a faster computer

The Geekiest One

Does your computer feel slower than it used to be? Does it take longer to start up or for programs to load? If so, chances are your computer has accumulated some “digital dust” and needs a little spring cleaning.
To better understand what causes your computer to slow down over time (and what you can do about it), here are ten sources of “digital dust.” The tips are based on a blog post Agent Wiebusch did a couple years ago on reasons your computer may be running slow. I have updated the advice a bit.
1) Too many programs running at the same time. Over the lifespan of a computer it is common for users to download programs, applications and other data that ends up “running in the background.” Many of these programs start automatically and you may not be aware they are open. The more things that run in…


How to Teach a Computer to Distinguish Cats from Dogs

To teach a child the difference between “cat” versus “dog”, parents (almost!) never give their children some kind of formal scientific definition to distinguish the two; i.e., that a dog is a member of the Canis familiaris species from the broader class of Mammalia, and that a cat while being from the same class belongs to another species known as Felis catus. No, instead the child is naturally presented with many images of what they are told are either “dogs” or “cats” until they fully grasp the two concepts. How do we know when a child can successfully distinguish between cats and dogs? Intuitively, when they encounter new (images of) cats and dogs, and can correctly identify each new example. Like human beings, computers can be taught how to perform this sort of task in a similar manner. This kind of task, where we aim to teach a computer to distinguish between different types of things, is referred to as a classification problem in machine learning.

1. Collecting Data

Like human beings, a computer must be trained to recognize the difference between these two types of animal by learning from a batch of examples typically referred to as a training set of data.

cats and dogs.PNG
Figure 1.1

A training set of six cats (left panel) and six dogs (right panel). This set is used to train a machine learning model that can distinguish between future images of cats and dogs. The images in this figure were taken from [31].

Figure 1.1 shows such a training set consisting of a few images of different cats and dogs. Intuitively, the larger and more diverse the training set the better a computer (or human) can perform a learning task since exposure to a wider breadth of examples gives the learner more experience.

2. Designing features

Think for a moment about how you yourself tell the difference between images containing cats from those containing dogs. What do you look for in order to tell the two apart? You likely use color, size, the shape of the ears or nose, and/or some combination of these features in order to distinguish between the two. In other words, you do not just look at an image as simply a collection of many small square pixels. You pick out details, or features, from images like these in order to identify what it is you are looking at. This is true for computers as well. In order to successfully train a computer to perform this task (and any machine learning task more generally) we need to provide it with properly designed features or, ideally, have it find such features itself. This is typically not a trivial task, as designing quality features can be very application dependent. For instance, a feature like “number of legs” would be unhelpful in discriminating between cats and dogs (since they both have four!), but quite helpful in telling cats and snakes apart. Moreover, extracting the features from a training dataset can also be challenging. For example, if some of our training images were blurry or taken from a perspective where we could not see the animal’s head, the features we designed might not be properly extracted.
However, for the sake of simplicity with our toy problem here, suppose we can easily extract the following two features from each image in the training set:

1. size of nose, relative to the size of the head (ranging from small to big);
2. shape of ears (ranging from round to pointy).

Examining the training images shown in Figure 1.1, we can see that cats all have small noses and pointy ears, while dogs all have big noses and round ears. Notice that with the current choice of features each image can now be represented by just two numbers: a number expressing the relative nose size, and another number capturing the pointiness or roundness of ears. Therefore we now represent each image in our training set in a 2-dimensional feature space where the features “nose size” and “ear shape” are the horizontal and vertical coordinate axes respectively, as illustrated in Figure 1.2. Because our designed features distinguish cats from dogs in our training set so well the feature representations of the cat images are all clumped together in one part of the space, while those of the dog images are clumped together in a different part of the space.

cats and dogs 2
Figure 1.2

Feature space representation of the training set where the horizontal and vertical axes represent the features “nose size” and “ear shape” respectively. The fact that the cats and dogs from our training set lie in distinct regions of the feature space reflects a good choice of features.
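In code, such a feature-space representation is just a list of number pairs with labels. The values below are invented for illustration (0 means a small nose or round ears, 1 means a big nose or pointy ears) and were not measured from the images in Figure 1.1.

```python
# Each example: ((nose_size, ear_pointiness), label), with made-up values in [0, 1]
training_set = [
    ((0.20, 0.90), "cat"),
    ((0.30, 0.80), "cat"),
    ((0.10, 0.70), "cat"),
    ((0.80, 0.20), "dog"),
    ((0.90, 0.30), "dog"),
    ((0.70, 0.10), "dog"),
]


def centroid(points):
    """Average position of a group of 2-dimensional feature vectors."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


cats = [features for features, label in training_set if label == "cat"]
dogs = [features for features, label in training_set if label == "dog"]
print(centroid(cats), centroid(dogs))
```

The two centroids land far apart, which is the numerical counterpart of the two clumps described above.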

3. Training a Model

Now that we have a good feature representation of our training data the final act of teaching a computer how to distinguish between cats and dogs is a simple geometric problem: have the computer find a line or linear model that clearly separates the cats from the dogs in our carefully designed feature space. Since a line (in a 2-dimensional space) has two parameters, a slope and an intercept, this means finding the right values for both. Because the parameters of this line must be determined based on the (feature representation of the) training data the process of determining proper parameters, which relies on a set of tools known as numerical optimization, is referred to as the training of a model.

Figure 1.3 shows a trained linear model (in black) which divides the feature space into cat and dog regions. Once this line has been determined, any future image whose feature representation lies above it (in the blue region) will be considered a cat by the computer, and likewise, any representation that falls below the line (in the red region) will be considered a dog.
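One simple way to find such a line is the classic perceptron update rule. This is my own choice for illustration and not necessarily the optimization method the text has in mind. The feature values are made up, with cats labeled +1 and dogs labeled -1.

```python
def train_perceptron(points, labels, epochs=100, step=0.1):
    """Find weights and bias so that sign(w . x + b) matches the labels (+1 or -1)."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for (x1, x2), y in zip(points, labels):
            score = w[0] * x1 + w[1] * x2 + b
            if y * score <= 0:  # point is on the wrong side (or on the line)
                w[0] += step * y * x1
                w[1] += step * y * x2
                b += step * y
                mistakes += 1
        if mistakes == 0:  # every training point is separated, so stop
            break
    return w, b


def predict(point, w, b):
    """Cat if the point lies on the positive side of the line, dog otherwise."""
    return "cat" if w[0] * point[0] + w[1] * point[1] + b > 0 else "dog"


# Made-up (nose size, ear pointiness) features
points = [(0.2, 0.9), (0.3, 0.8), (0.1, 0.7), (0.8, 0.2), (0.9, 0.3), (0.7, 0.1)]
labels = [1, 1, 1, -1, -1, -1]
w, b = train_perceptron(points, labels)
print(w, b)
```

The learned line is w[0]*x1 + w[1]*x2 + b = 0. When w[1] is nonzero this can be rewritten in slope and intercept form as x2 = -(w[0]/w[1])*x1 - b/w[1].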

cats and dogs 3
Figure 1.3

A trained linear model (shown in black) perfectly separates the two classes of animal present in the training set. Any new image received in the future will be classified as a cat if its feature representation lies above this line (in the blue region), and a dog if the feature representation lies below this line (in the red region).

cats and dogs 4.PNG
Figure 1.4

A testing set of cat and dog images also taken from [31]. Note that one of the dogs, the Boston terrier on the top right, has both a short nose and pointy ears. Due to our chosen feature representation, the computer will think this is a cat!

4. Testing the Model

To test the efficacy of our learner we now show the computer a batch of previously unseen images of cats and dogs (referred to generally as a testing set of data) and see how well it can identify the animal in each image. In Figure 1.4 we show a sample testing set for the problem at hand, consisting of three new cat and dog images. To run the test we take each new image, extract our designed features (nose size and ear shape), and simply check which side of our line the feature representation falls on. In this instance, as can be seen in Figure 1.5, all of the new cats and all but one dog from the testing set have been identified correctly.
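Testing boils down to counting how often the model's answer matches the true label. In the sketch below the line "ear pointiness = nose size" stands in for the trained model, and the testing features are invented, with the last entry playing the part of the Boston terrier.

```python
def predict(features):
    """Stand-in for the trained model: above the line ear = nose means cat."""
    nose, ear = features
    return "cat" if ear > nose else "dog"


# Made-up (nose size, ear pointiness) features with their true labels
test_set = [
    ((0.20, 0.85), "cat"),
    ((0.15, 0.90), "cat"),
    ((0.30, 0.75), "cat"),
    ((0.80, 0.20), "dog"),
    ((0.75, 0.30), "dog"),
    ((0.25, 0.80), "dog"),  # short nose and pointy ears, like the Boston terrier
]


def accuracy(examples):
    """Fraction of examples whose predicted label equals the true label."""
    correct = sum(1 for features, label in examples if predict(features) == label)
    return correct / len(examples)


def misclassified(examples):
    """Examples the model gets wrong, useful for seeing why it fails."""
    return [(f, label) for f, label in examples if predict(f) != label]


print(accuracy(test_set))
print(misclassified(test_set))
```

Looking at the misclassified examples, and not just the accuracy figure, is what points to the features themselves as the source of the problem.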

cats and dogs 5
Figure 1.5

Identification of (the feature representation of) our test images using our trained linear model. Notice that the Boston terrier is misclassified as a cat since it has pointy ears and a short nose, just like the cats in our training set.

The misidentification of the single dog (a Boston terrier) is due completely to our choice of features, which we designed based on the training set in Fig. 1.1. This dog has been misidentified simply because its features, a small nose and pointy ears, match those of the cats from our training set. So while it first appeared that a combination of nose size and ear shape could indeed distinguish cats from dogs, we now see that our training set was too small and not diverse enough for this choice of features to be completely effective.

To improve our learner we must begin again. First, we should collect more data, forming a larger and more diverse training set. Then we will need to consider designing more discriminating features (perhaps eye color, tail shape, etc.) that further help distinguish cats from dogs. Finally, we must train a new model using the designed features and test it in the same manner to see if our newly trained model is an improvement over the old one.


Let us now briefly review the previously described process, by which a trained model
was created for the toy task of differentiating cats from dogs. The same process is used
to perform essentially all machine learning tasks, and therefore it is worthwhile to pause
for a moment and review the steps taken in solving typical machine learning problems.
We enumerate these steps below to highlight their importance. We refer to them all
together as the general pipeline for solving machine learning problems, and provide a
picture that compactly summarizes the entire pipeline in Figure 1.6.

Figure 1.6

The learning pipeline of the cat versus dog classification problem. The same general pipeline is used for essentially all machine learning problems.

0 Define the problem. What is the task we want to teach a computer to do?
1 Collect data. Gather data for training and testing sets. The larger 
and more diverse the data the better.
2 Design features. What kind of features best describe the data?
3 Train the model. Tune the parameters of an appropriate model on the
training data using numerical optimization.
4 Test the model. Evaluate the performance of the trained model on the 
testing data. If the results of this evaluation are poor, re-think 
the particular features used and gather more data if possible.
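
The steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the feature values (nose size and ear pointiness, both scaled to [0, 1]) are invented, the test set is made up, and the perceptron update rule stands in for whatever numerical optimization one would actually use to train a linear model.

```python
# Toy version of the pipeline: hand-made features, a linear model, a test set.
# All feature values below are invented for illustration.

# Steps 1-2: each animal is (nose_size, ear_pointiness), both scaled to [0, 1].
# Label +1 = cat, -1 = dog.
train = [
    ((0.20, 0.90), +1), ((0.30, 0.80), +1), ((0.25, 0.85), +1),   # cats
    ((0.80, 0.20), -1), ((0.70, 0.30), -1), ((0.90, 0.10), -1),   # dogs
]
test = [
    ((0.20, 0.80), +1),   # cat
    ((0.30, 0.90), +1),   # cat
    ((0.80, 0.30), -1),   # dog
    ((0.75, 0.20), -1),   # dog
    ((0.25, 0.90), -1),   # Boston terrier: a dog with a short nose and pointy ears
]


def predict(w, b, x):
    """Return +1 (cat) or -1 (dog) depending on the side of the line w.x + b = 0."""
    score = w[0] * x[0] + w[1] * x[1] + b
    return 1 if score > 0 else -1


def train_perceptron(data, epochs=100, lr=0.1):
    """Step 3: tune the line's parameters with the perceptron update rule."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in data:
            if predict(w, b, x) != y:
                w[0] += lr * y * x[0]
                w[1] += lr * y * x[1]
                b += lr * y
                mistakes += 1
        if mistakes == 0:      # every training point is on the correct side
            break
    return w, b


def accuracy(w, b, data):
    """Step 4: fraction of examples the trained line classifies correctly."""
    return sum(predict(w, b, x) == y for x, y in data) / len(data)


w, b = train_perceptron(train)
train_acc = accuracy(w, b, train)
test_acc = accuracy(w, b, test)
print(f"train accuracy: {train_acc:.2f}, test accuracy: {test_acc:.2f}")
```

With these made-up numbers the line separates the training set perfectly, yet the terrier-like test point lands on the cat side, which mirrors the failure described above: the features, not the optimizer, are the weak link.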

Starting my Udacity Journey


Today I decided to start my distance learning journey with Udacity. I have worked my way through numerous MOOC sites, such as FutureLearn, Open, edX and Coursera, and have not looked at Udacity until now.

Each program Udacity offers is designed to help you achieve goals, meet objectives, and succeed in your life and career. Whether you have a specific job in mind or want to learn specific skills, the best way to decide is to envision your desired outcome, and then select the path that will get you there. Sometimes this is easy: you want to build Android apps, so you take the Android Developer program! But if you’re not sure, they can guide you in the right direction. Their blog is a great resource for career pathing, and you can always email them. They will learn about your interests, your goals, and your experience, and make personalized recommendations on the best ways to move forward.

For starters, I will be doing the various free courses:


After that, I plan to take the Intro to Programming Nanodegree, where I will get a refresher on web development and a solid foundation in Python and its syntax. Through this course, I will also build a coding portfolio. This course currently costs $299 (normal price $399).

Once I complete this I will have a few options. I am leaning more towards Artificial Intelligence and Machine Learning Engineering.

Further Courses I am looking at:

Follow me on my journey and progress.


Along with all this, I will be doing side courses at edX and Coursera, continuing to blog about cyber security, and continuing with my Open University degree.



How to prepare for PWK/OSCP, a noob-friendly guide

A few months ago, I didn’t know what Bash was and had only heard of SSH tunneling, with no practical knowledge. I also didn’t like the idea of paying for PWK lab time without using it, so I went through a number of resources until I felt ready to start the course.

Warning: Don’t expect to be spoon-fed if you’re doing OSCP. You’ll need to spend a lot of time researching, and neither the admins nor the other students will give you answers easily.

1. PWK Syllabus
1.1 *nix and Bash
1.2 Basic tools
1.3 Passive Recon
1.4 Active Recon
1.5 Buffer Overflow
1.6 Using public exploits
1.7 File Transfer
1.8 Privilege Escalation
1.9 Client-Side Attacks
1.10 Web Application Attacks
1.11 Password Attacks
1.12 Port Redirection/Tunneling
1.13 Metasploit Framework
1.14 Antivirus Bypassing
2. Wargames
2.1 Over The Wire: Bandit
2.2 Over The Wire: Natas
3. Vulnerable VMs

1. PWK Syllabus:

Simply the most important reference in the list: it shows the course modules in detail. My entire preparation was based on it. It can be found here.

1.1 *nix and Bash:

You don’t need to use Kali Linux right away; Ubuntu is a good alternative until you get comfortable with Linux.

1. Bash for Beginners: Best Bash reference IMO.
2. Bandit on Over The Wire: Great start for people who aren’t used to a terminal or aren’t familiar with Bash or *nix in general. Each challenge hints at which commands you can use; you need to research them yourself.
3. Explainshell: Does NOT replace man pages, but breaks down commands nicely for newcomers.

1.2 Basic tools:

You will use these tools a lot. Make sure you understand what they do and how you can utilize them.

Netcat: Most important tool in the entire course. Understand what it does, what options you have, and the difference between a reverse shell and a bind shell. Experiment a lot with it.
Ncat: Netcat’s mature brother, supports SSL. Part of Nmap.
Wireshark: Network analysis tool. Play with it while browsing the internet or connecting to FTP, and practice reading and writing PCAP files.
Tcpdump: Not all machines have that cute GUI; you could be stuck with a terminal.

1.3 Passive Recon:

Read about the following tools/techniques and experiment as much as possible.

1. Google dorks
2. Whois
3. Netcraft
4. Recon-ng: Make sure you check the Usage guide to know how it works.

1.4 Active Recon:

  • Understand what DNS is, how it works, how to perform forward and reverse lookups, what zone transfers are and how to perform them. Great resource here.
  • Nmap: One of the most used tools during the course (if not the most). I’d recommend starting with the man pages to understand the different scanning techniques and other capabilities it has (scripts, OS detection, service detection, …).
  • Services enumeration: SMTP, SNMP, SMB, and many others. Don’t just enumerate them; understand what they’re used for and how they work.
  • Great list for enumeration and tools.
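
To make the idea of a scanning technique concrete, here is a minimal sketch of a TCP connect scan, the same basic idea as Nmap’s connect scan (`-sT`): try a full TCP handshake on each port and record which ones succeed. It is not how Nmap itself is implemented; the function name `tcp_connect_scan` and the demo listener are my own, and the demo only touches a socket it opens on localhost.

```python
import socket


def tcp_connect_scan(host, ports, timeout=0.5):
    """Return the ports on `host` that complete a TCP handshake."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


# Demo against a listener we start ourselves, so nothing external is scanned.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))          # port 0 lets the OS pick a free port
listener.listen()
demo_port = listener.getsockname()[1]

found_while_open = tcp_connect_scan("127.0.0.1", [demo_port])
listener.close()
found_after_close = tcp_connect_scan("127.0.0.1", [demo_port])
print(found_while_open, found_after_close)
```

Only scan machines you own or are explicitly allowed to test, such as the lab or a vulnerable VM.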

1.5 Buffer Overflow:

The most fun part, in my opinion. There are countless resources on how to get started; I’d recommend Corelan’s series. You probably need only the first part for PWK.

1.6 Using public exploits:

Occasionally, you’ll need to use a public exploit, maybe even modify the shellcode or other parts. Just go to Exploit-DB and pick one of the older, more reliable exploits (FTP ones, for example). The vulnerable version is usually present with the exploit code.

1.7 File Transfer:

Not every machine has netcat installed, so you’ll need to find a way around that to upload exploits or other tools you need. Great post on this is here.
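
One common workaround is to serve files over HTTP from the attacking machine and fetch them from the target. The sketch below shows the idea with Python’s standard library only, and it is not taken from the linked post. The directory, the file name `tool.txt`, and running both ends in one process on localhost are assumptions made so that the example is self-contained.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request

# Hypothetical directory holding the files we want to transfer.
share_dir = tempfile.mkdtemp()
pathlib.Path(share_dir, "tool.txt").write_text("example payload\n")

# "Attacker" side: serve that directory over HTTP on a free local port.
handler = functools.partial(http.server.SimpleHTTPRequestHandler, directory=share_dir)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_address[1]

# "Target" side: download the file, ignoring any system proxy settings.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(f"http://127.0.0.1:{port}/tool.txt", timeout=5) as resp:
    downloaded = resp.read()

server.shutdown()
server.server_close()
print(downloaded)
```

On a real engagement the serving half is usually just `python3 -m http.server` run in the directory you want to share, and the target pulls the file with whatever HTTP client it happens to have.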

1.8 Privilege Escalation:

A never-ending topic. There are a lot of techniques, ranging from having an admin password to kernel exploits. A great way to practice is with Vulnhub VMs. Check my OSCP-like VMs list here.

Windows: Elevating privileges by exploiting weak folder permissions
Windows: Privilege Escalation Fundamentals
Windows: Windows-Exploit-Suggester
Windows: Privilege Escalation Commands
Linux: Basic Linux Privilege Escalation
Linux: LinEnum
Practical Windows Privilege Escalation
MySQL Root to System Root with UDF

1.9 Client Side Attacks:

Try out the techniques provided in Metasploit Unleashed or an IE client side exploit.

1.10 Web Application Attacks

Another lengthy subject. Understand what XSS, SQL injection, LFI, RFI, and directory traversal are, and how to use a proxy like Burp Suite. Solve as many challenges as you can from Natas on Over The Wire. It has great examples of code injection, session hijacking and other web vulnerabilities.

The key is to research until you feel comfortable.
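
As a small illustration of SQL injection, the sketch below contrasts a login check built by string concatenation with one that uses placeholders. The table, the credentials, and the function names are all made up for the example, and it runs against an in-memory SQLite database rather than any real application.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 's3cret')")


def login_vulnerable(name, password):
    """Builds the query by string concatenation, so input can rewrite the SQL."""
    query = (
        "SELECT name FROM users WHERE name = '" + name +
        "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchone() is not None


def login_safe(name, password):
    """Uses placeholders, so input is always treated as data, never as SQL."""
    query = "SELECT name FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone() is not None


payload = "' OR '1'='1"
print(login_vulnerable("admin", payload))   # injection bypasses the password check
print(login_safe("admin", payload))         # payload is compared as a literal string
```

The payload closes the password string early and appends `OR '1'='1'`, so the vulnerable query’s WHERE clause is true for every row. The parameterized version never lets the input change the query’s structure.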

1.11 Password Attacks:

Understand the basics of password attacks and the difference between online and offline attacks. Learn how to use Hydra, JTR, and Medusa, what rainbow tables are, and so on; the list goes on. Excellent post on this topic here.
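
To show what “offline” means in practice, here is a minimal sketch of a dictionary attack against a captured hash: every guess is hashed locally and compared, so the target is never contacted. An online attack instead sends each guess to a live login service. The unsalted SHA-256 scheme, the leaked hash, and the tiny wordlist are assumptions for illustration; real crackers such as JTR support many hash formats and are far faster.

```python
import hashlib


def sha256_hex(password):
    """Hash a candidate password the same way the (assumed) target system did."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(target_hash, wordlist):
    """Offline attack: hash each candidate locally and compare to the captured hash."""
    for candidate in wordlist:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None


# Hypothetical leaked hash and a tiny hypothetical wordlist.
leaked_hash = sha256_hex("letmein")
wordlist = ["123456", "password", "letmein", "qwerty"]
recovered = dictionary_attack(leaked_hash, wordlist)
print(recovered)
```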

1.12 Port redirection/tunneling:

Not all machines are directly accessible; some are dual-homed and connected to an internal network. You’ll use such techniques a lot in non-public networks. This post did a great job explaining it.
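
The core of port redirection is a relay: a process on the dual-homed machine accepts a connection and copies bytes to and from a service that the attacker cannot reach directly. The sketch below is a bare-bones, single-connection relay on localhost. The names `pipe`, `forward_once`, and `echo_once`, and the uppercasing "internal service", are invented for the demo. In practice you would more likely use an existing tool, for example SSH local forwarding (`ssh -L`).

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src closes, then signal end-of-stream to dst."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def forward_once(listener, target_addr):
    """Accept one client on `listener` and relay it to `target_addr`."""
    client, _ = listener.accept()
    upstream = socket.create_connection(target_addr)
    t = threading.Thread(target=pipe, args=(client, upstream), daemon=True)
    t.start()
    pipe(upstream, client)
    t.join()
    client.close()
    upstream.close()


def echo_once(listener):
    """Stand-in for an internal service: reply with the request, uppercased."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(conn.recv(4096).upper())


def listen_local():
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", 0))      # port 0 lets the OS pick a free port
    sock.listen()
    return sock


internal = listen_local()            # the "unreachable" internal service
pivot = listen_local()               # the port exposed on the dual-homed machine
threading.Thread(target=echo_once, args=(internal,), daemon=True).start()
threading.Thread(
    target=forward_once, args=(pivot, internal.getsockname()), daemon=True
).start()

# The client only ever talks to the pivot port.
chunks = []
with socket.create_connection(pivot.getsockname(), timeout=5) as client:
    client.sendall(b"hello through the pivot")
    while True:
        chunk = client.recv(4096)
        if not chunk:
            break
        chunks.append(chunk)
reply = b"".join(chunks)
print(reply)
```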

1.13 Metasploit Framework:

I decided to skip this part, but if you still want to study it, check out the Metasploit Unleashed course.


1.14 Antivirus Bypassing:

Skipped this part too.

2. Wargames

Use them as prep for vulnerable machines.

2.1 Over The Wire: Bandit

Great start for people who aren’t familiar with Linux or Bash.

2.2 Over The Wire: Natas

Focused on web applications. Many challenges aren’t required for OSCP, but it helps for sure.


Has great challenges on privilege escalation, SQL injection, Javascript obfuscation, password cracking and analyzing PCAP files

3. Vulnerable Machines

Boot-to-root VMs are excellent for pentesting practice: you import a VM, run it, and start enumerating from your attacking machine. Most of them result in getting root access. Check the post on which machines are the closest to OSCP, there is also the .

Blog posts regarding my journey through

Pentestit Lab v10 – Introduction & Setup
Pentestit Lab v10 – The Mail Token
Pentestit Lab v10 – The Site Token